sim,python: follow the new CPython startup sequence

gem5 currently suffers from several bugs related to
the Python interpreter's locale encoding. In
particular, gem5 crashes when the working directory
contains non-ASCII characters.

The reason is that Python 3.8+ introduces a new
interpreter startup sequence [1][2]. The startup
sequence consists of three phases:

1. Python core runtime preinitialization
2. Python core runtime initialization
3. Main interpreter configuration

Phase 1 determines the encodings used for system
interfaces.
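
To see why the encoding chosen during preinitialization
matters for a path such as the working directory, here is
a minimal sketch that is independent of CPython. The helper
names `decodeAscii` and `decodeUtf8` are hypothetical and
are not CPython or gem5 functions; they only contrast a
strict ASCII decoder with a simplified UTF-8 decoder on the
same bytes.

```cpp
#include <cstddef>
#include <string>

// Strict ASCII: every byte must be below 0x80.
bool
decodeAscii(const std::string &bytes, std::u32string &out)
{
    out.clear();
    for (unsigned char b : bytes) {
        if (b >= 0x80)
            return false;  // non-ASCII byte cannot be decoded
        out.push_back(static_cast<char32_t>(b));
    }
    return true;
}

// Simplified UTF-8 decoder for illustration only: it validates lead and
// continuation bytes but does not reject overlong forms or surrogates.
bool
decodeUtf8(const std::string &bytes, std::u32string &out)
{
    out.clear();
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;
        char32_t cp;
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;  // stray continuation or invalid lead byte

        if (bytes.size() - i <= extra)
            return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}
```

With the path bytes "/tmp/\xE6\xB5\x8B" (a directory named
with the single character U+6D4B), `decodeAscii` fails while
`decodeUtf8` yields six code points.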

However, gem5 doesn't preinitialize the Python
interpreter. Thus, the locale settings do not take
effect. This patch preinitializes the interpreter
on Python 3.8+.
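
In the diff, the 3.8+ check is written as
`PY_VERSION_HEX >= 0x03080000`. The sketch below spells out
the layout CPython documents for that macro (major, minor
and micro in the top three bytes, release level and serial
in the low byte). The names `pyVersionHex`, `kPreinitGate`
and `usesPreinit` are hypothetical helpers for illustration
and are not part of gem5 or CPython.

```cpp
#include <cstdint>

// Builds a value with the same layout as PY_VERSION_HEX.
// level: 0xA alpha, 0xB beta, 0xC release candidate, 0xF final.
constexpr std::uint32_t
pyVersionHex(unsigned major, unsigned minor, unsigned micro,
             unsigned level = 0xF, unsigned serial = 0)
{
    return (major << 24) | (minor << 16) | (micro << 8) |
           (level << 4) | serial;
}

// The threshold used by the #if in this patch.
constexpr std::uint32_t kPreinitGate = 0x03080000;

// True when a given version would take the preinitialization branch.
constexpr bool
usesPreinit(std::uint32_t hex)
{
    return hex >= kPreinitGate;
}

// 3.8.0 final encodes as 0x030800F0, above the gate.
static_assert(pyVersionHex(3, 8, 0) == 0x030800F0, "3.8.0 final");
// Even the first 3.8 alpha (0x030800A1) passes the gate.
static_assert(usesPreinit(pyVersionHex(3, 8, 0, 0xA, 1)), "3.8.0a1");
// Any 3.7 release stays on the legacy branch.
static_assert(!usesPreinit(pyVersionHex(3, 7, 17)), "3.7.17");
```

Because the gate has zero in the micro, level and serial
fields, every 3.8 build, including pre-releases, compares
greater than or equal to it.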

Also, this patch avoids the use of `Py_SetProgramName`,
which has been deprecated since Python 3.11 [3].

[1] https://peps.python.org/pep-0432/
[2] https://peps.python.org/pep-0587/
[3] https://docs.python.org/3/c-api/init.html#c.Py_SetProgramName

Change-Id: I08a2ec6ab2b39a95ab194909932c8fc578c745ce
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/70898
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Bobby Bruce <bbruce@ucdavis.edu>
Maintainer: Bobby Bruce <bbruce@ucdavis.edu>
Reviewed-by: Jason Lowe-Power <power.jg@gmail.com>
Maintainer: Jason Lowe-Power <power.jg@gmail.com>
Reviewed-by: Roger Chang <rogerycchang@google.com>
diff --git a/src/python/gem5py.cc b/src/python/gem5py.cc
index f2d8759..37ddee2 100644
--- a/src/python/gem5py.cc
+++ b/src/python/gem5py.cc
@@ -51,6 +51,21 @@
 int
 main(int argc, const char **argv)
 {
+#if PY_VERSION_HEX >= 0x03080000
+    // Preinitialize Python for Python 3.8+
+    // This ensures that the locale configuration takes effect
+    PyStatus status;
+    PyPreConfig preconfig;
+    PyPreConfig_InitPythonConfig(&preconfig);
+
+    preconfig.utf8_mode = 1;
+
+    status = Py_PreInitialize(&preconfig);
+    if (PyStatus_Exception(status)) {
+        Py_ExitStatusException(status);
+    }
+#endif
+
     py::scoped_interpreter guard;
 
     // Embedded python doesn't set up sys.argv, so we'll do that ourselves.
diff --git a/src/sim/main.cc b/src/sim/main.cc
index 81a691d..1c42891 100644
--- a/src/sim/main.cc
+++ b/src/sim/main.cc
@@ -50,6 +50,7 @@
     // Initialize gem5 special signal handling.
     initSignals();
 
+#if PY_VERSION_HEX < 0x03080000
     // Convert argv[0] to a wchar_t string, using python's locale and cleanup
     // functions.
     std::unique_ptr<wchar_t[], decltype(&PyMem_RawFree)> program(
@@ -59,6 +60,23 @@
     // This can help python find libraries at run time relative to this binary.
     // It's probably not necessary, but is mostly harmless and might be useful.
     Py_SetProgramName(program.get());
+#else
+    // Preinitialize Python for Python 3.8+
+    // This ensures that the locale configuration takes effect
+    PyStatus status;
+
+    PyConfig config;
+    PyConfig_InitPythonConfig(&config);
+
+    /* Set the program name. Implicitly preinitialize Python. */
+    status = PyConfig_SetBytesString(&config, &config.program_name,
+                                     argv[0]);
+    if (PyStatus_Exception(status)) {
+        PyConfig_Clear(&config);
+        Py_ExitStatusException(status);
+        return 1;
+    }
+#endif
 
     py::scoped_interpreter guard(true, argc, argv);