Project - Stage 1

 


This blog post is for the SPO600 fall project, in which we work on GCC. Stage one focuses on how to build GCC from source: the commands involved and how long the full build takes. GCC is a compiler collection that translates C, C++, and several other languages into machine code, and its source tree is very large. This post covers what I learned from my research into GCC.


Let's start with the build process. Unlike many small programs that have only hundreds or thousands of lines of code, GCC is a huge code base, so it takes a long time to compile. First, we need to download the source to our machine; the GCC website at gcc.gnu.org lists the git command for this. The command is git clone git://gcc.gnu.org/git/gcc.git SomeLocalDir. After downloading the source, we need to configure the build. We first create a separate directory that the files will be built in. From that directory, we run ~/git/gcc/configure --prefix=$HOME/gcc-test-001, which runs GCC's configuration script and generates the Makefile in the build directory; the --prefix option sets where make install will later place the compiler.


Then we use make to perform the build. By default, make runs one job at a time, but the -j option lets it run several jobs in parallel. We also pipe both standard output and standard error through tee to save them in a build log, and prefix the command with time to measure how long the build takes. The final command is time make -j 24 |& tee build.log. When it finishes, we run make install to install GCC. To check the build, we run make check in the build directory. If everything is good, the terminal output reports passes and no failures. On my laptop, the build took around 3 hours.
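The steps above can be collected into one small script. This is only a sketch: the build directory name gcc-build-001, the step helper, and the DRY_RUN switch are my own illustrative additions, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/usr/bin/env bash
# Sketch of the build steps described above. All paths are assumptions.
SRC_DIR="${SRC_DIR:-$HOME/git/gcc}"            # clone location (matches ~/git/gcc)
BUILD_DIR="${BUILD_DIR:-$HOME/gcc-build-001}"  # hypothetical separate build directory
PREFIX="${PREFIX:-$HOME/gcc-test-001}"         # install prefix passed to configure
JOBS="${JOBS:-24}"                             # number of parallel make jobs
DRY_RUN="${DRY_RUN:-1}"                        # 1 = print steps only, 0 = run them

# Print a step, and execute it in a fresh bash only when DRY_RUN=0.
step() {
  echo "+ $1"
  if [ "$DRY_RUN" = "0" ]; then
    bash -c "$1" || { echo "step failed: $1" >&2; exit 1; }
  fi
}

step "git clone git://gcc.gnu.org/git/gcc.git '$SRC_DIR'"
step "mkdir -p '$BUILD_DIR'"
step "cd '$BUILD_DIR' && '$SRC_DIR/configure' --prefix='$PREFIX'"
# 'time' measures the whole pipeline; '|&' sends stdout and stderr to tee.
step "cd '$BUILD_DIR' && time make -j $JOBS |& tee build.log"
step "cd '$BUILD_DIR' && make install"
step "cd '$BUILD_DIR' && make check"
```

Running it as is lists the six steps, which is a cheap way to double-check the paths. Running it with DRY_RUN=0 performs the real clone and build, which on a machine like mine means a few hours.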


The next task is learning how to navigate the GCC code base. The code that controls the compilation passes is in the file passes.cc under the gcc directory. That file works with many passes, and the pass classes are defined in tree-pass.h. To create a test pass and add it to GCC, we first need to create a C++ source file like the other pass files and put it in the gcc directory. For our test pass, we declare it with extern gimple_opt_pass *test_passes(void); and then register it in passes.cc using append_pass(&all_passes, test_passes()); For comparison, the passes that ship with GCC are declared in tree-pass.h as make_pass_* functions and placed in the pipeline through NEXT_PASS entries in passes.def.


For the next task, I need to find out how to access the IR of the program being compiled. For project stage one, I will focus on the GIMPLE IR, since this is the primary IR used for most analysis. I will try to create a pass that analyzes function clones. To collect the clones of each function, I wrote the following function in my pass:

/* Analyze function relationships */
void analyze_function_clones() {
    cgraph_node* node;

    FOR_EACH_FUNCTION_WITH_GIMPLE_BODY(node) {
        tree fndecl = node->decl;
        const char* fname = IDENTIFIER_POINTER(DECL_NAME(fndecl));

        /* Check if this function is a clone */
        if (DECL_ABSTRACT_ORIGIN(fndecl)) {
            tree origin = DECL_ABSTRACT_ORIGIN(fndecl);
            const char* origin_name = IDENTIFIER_POINTER(DECL_NAME(origin));

            /* Get or create clone info for the original function */
            clone_info* info = clone_map.get(origin_name);
            if (!info) {
                info = new clone_info;
                info->original_name = origin_name;
                info->clone_names.create(0);
                clone_map.put(origin_name, info);
            }

            /* Add this clone to the list */
            info->clone_names.safe_push(fname);
        }
    }
}

Then I created a function to print the analysis. As in the previous task, I added this pass to the passes.cc file.


The next task is to explore the dump system in GCC, which allows detailed inspection of the program state during compilation. The compiler option -fdump-tree-all dumps the GENERIC/GIMPLE tree representation after every tree pass, while -fdump-tree-[pass-name] dumps the tree representation after one specific pass. The option -fdump-rtl-all dumps the RTL (Register Transfer Language) representation after every RTL pass and, similarly, -fdump-rtl-[pass-name] dumps it after one specific RTL pass. I have done some research on the dumps, but I can't fully understand the output yet. For example, among the tree dumps, original is the initial tree representation, ssa is the representation after conversion to SSA form, and optimized is the tree after the tree-level optimizations have run.


I have found GCC quite challenging to understand. The first part of this project, learning how to build GCC, was not that hard, since I have experience using make to build C and C++ projects. The second and third parts were quite hard, because I have no experience developing such a huge project, in which passes define the multiple stages of the compilation process. As a result, I did not know what a pass is or how to write one and add it to GCC. There are also not many resources for researching GCC apart from the official documentation. For the third part, the concept of dumps was also quite new to me. I now know a little about how a large C program is compiled and what is needed to monitor the compilation.



