Changes between Version 1 and Version 2 of Island

Changes between Version 1 and Version 2 of Island

Please note that these Trac pages are no longer being updated. Wiki contents/documentation have moved to GitHub.

Changes between Version 1 and Version 2 of Island

Please note that these Trac pages are no longer being updated. Wiki contents/documentation have moved to GitHub.

Changes between Version 1 and Version 2 of Island

Show
Ignore:
Timestamp:
01/13/10 18:40:28 (10 years ago)
Author:
armon
Comment:

Added Examples

Legend:

Unmodified
Added
Removed
Modified
  • Island

    v1 v2  
    55This is a central component of CADET, since it is utilized to determine the performance characteristics of a program to optimize the host machine selection. 
    66 
     7[[BR]] 
    78---- 
    89[[TOC(inline)]] 
    910---- 
    10  
     11[[BR]] 
    1112== Components == 
    1213 
     
    1819 * prgm_stat.repy : Forces a normal repy program to run within an Island, and prints statistics upon termination. 
    1920 
    20  
     21[[BR]][[BR]] 
    2122== Island Class Implementation Details == 
    2223 
     
    4243logic on the various function calls. 
    4344 
    44  
     45[[BR]][[BR]] 
    4546== Island Statistics Implementation Details == 
    4647 
     
    7475equivalents. 
    7576 
    76  
     77[[BR]][[BR]] 
    7778== Stoptime Logger Module Implementation Details == 
    7879 
     
    9192number of entries logged. 
    9293 
    93  
     94[[BR]][[BR]] 
    9495== Program Stats Implementation Details == 
    9596 
     
    100101use the Island module, or any of its parsing routines. 
    101102 
    102  
    103  
    104  
    105  
    106  
    107  
    108  
    109  
    110  
    111  
    112  
    113  
    114  
    115  
    116  
     103[[BR]][[BR]] 
     104== Examples == 
     105[[BR]] 
     106=== Single threaded program === 
     107 
     108Here is a simple program, which only uses a single thread of execution: 
     109 
     110{{{ 
     111 
     112def write_data(): 
     113    lim, usage, stoptimes = getresources() 
     114    file_write_lim = lim["filewrite"] 
     115    file_write_lim *= 2 
     116 
     117    file_handle = open("test.out","w") 
     118    data = "*" * int(file_write_lim) 
     119    file_handle.write(data) 
     120    file_handle.close() 
     121 
     122def use_cpu(): 
     123    start = getruntime() 
     124    while getruntime() - start < 1: 
     125        for x in xrange(500): 
     126            val1 = x ** 2 
     127            val2 = x ** 3 
     128 
     129write_data() 
     130use_cpu() 
     131 
     132}}} 
     133 
     134The write_data() function determine what about of data can be written in a second, 
     135and tries to write twice that amount. This should cause the program to block for second, 
     136while we compensate for over-using our file write resource. 
     137 
     138The use_cpu() function performs some trivial computation in a loop until a second of  
     139wall time has elapsed. The only action the program takes is to first call write_data(), and 
     140then use_cpu(). 
     141 
     142I've named this program test_basic.repy, and ran it a 10% CPU restriction, 
     143allowing 100K bytes to be written per second, with the following command: 
     144 
     145{{{ 
     146$ python repy.py restrictions.test dylink.repy prgm_stat.repy test_basic.repy 
     147 
     148--- Island Summary --- 
     149 
     150Total threads: 1 
     151Live threads: 0 
     152Total files: 1 
     153Open files: 0 
     154Total sockets: 0 
     155Open sockets: 0 
     156 
     157--- Global Sum --- 
     158 
     159Total 'fileread': 0 
     160Total 'filesopened': 1 
     161Total 'tcpsend': 0 
     162Total 'random': 0 
     163Total 'insockets': 0 
     164Total 'filewrite': 200000 
     165Total 'events': 0 
     166Total 'udpsend': 0 
     167Total 'udprecv': 0 
     168Total 'tcprecv': 0 
     169Total 'outsockets': 0 
     170Total 'cpu': 0.182549 
     171 
     172--- Island Timeline --- 
     173 
     174TOC, Thread, API, Amount 
     1751.78870010376 MainThread island_create_thread  
     1761.7893280983 MainThread open 1 
     1771.78962397575 MainThread start call file.write 
     1782.79035711288 MainThread file.write 200000 
     1792.87065291405 ALL Stopped 0.704621213531 
     1803.675604105 ALL Stopped 0.906205998082 
     1814.58244991302 MainThread island_destroy_thread  
     182 
     183--- Thread Time Consumption --- 
     184 
     185Thread: 
     186  Name: MainThread 
     187  Total 'filewrite': 1.00073313713 (35.82%) 
     188  Total 'live': 2.79374980927 (100.0%) 
     189  Total 'stopped': 1.61082721161 (57.66%) 
     190  Total 'cpu': 0.182189460521 (6.52%) 
     191 
     192All Threads (Global) 
     193  Total 'filewrite': 1.00073313713 (35.82%) 
     194  Total 'live': 2.79374980927 (100.0%) 
     195  Total 'stopped': 1.61082721161 (57.66%) 
     196  Total 'cpu': 0.182189460521 (6.52%) 
     197 
     198}}} 
     199 
     200Now, lets look at the output produced. At the top we have our "Island Summary" and "Global Sum". This shows us that the 
     201Island has only a single thread over its life, and that the only resources it used were a single file handle, 200K bytes worth 
     202of filewrite and about ~0.18 seconds of actual CPU time. Since we wrote 2x our filewrite limit (100K) to a single file, 
     203this all seems consistent. 
     204 
     205Next, we see a timeline of activity in the Island. The MainThread has a "island_create_thread" event which indicates that 
     206it has "joined" the Island at a certain time. Then a file is opened, we start the call to file.write() and eventually return from it. 
     207Notice the difference between the T.O.C and the T.O.R from the file.write of 1 second. This reflects what we expected, since 
     208we wrote twice our limit. After that, it shows that "ALL" the threads were stopped at 2.87 and 3.67 seconds for 0.70 and 0.91 seconds 
     209respectively. This corresponds to the call to use_cpu() which is presumably using as much CPU as possible. Finally, we see 
     210"island_destroy_thread" which indicates that the thread has left the Island or terminated. 
     211  
     212Lastly, we have our thread time consumption. Since we only have a single thread, the Global usage, and the usage of the 
     213"MainThread" are the same, so we an just focus on one. The first key entry is the 'live' entry. This indicates the total amount 
     214of wall time that this thread was part of the Island. We are reporting about 2.8 seconds, which can be checked by looking at 
     215the island_create_thread and island_destroy_thread times above. Next, we have the "filewrite" entry. It indicates that 1 second 
     216was spent writing to a file, which is correct and accounts for about 35% of the life of the program. One might expect the remaining 
     217time to be allocated to "cpu", since all we did was perform some computation in a loop. However, since we are throttled to 10% CPU usage, 
     218a more significant portion of our time was spent stopped. In face, the proportion of time stopped and in "cpu" is no coincidence. 
     219Notice that "cpu" / ("cpu" + "stopped") = 0.10 which matches our CPU restriction. 
     220 
     221Analyzing this information reveals that the greatest speed-up to this program would be from allocating more CPU, to reduce the amount 
     222of time spent in the "stopped" state. 
     223 
     224[[BR]][[BR]] 
     225=== Multi-threaded Program === 
     226 
     227I've taken the example program from above, but replaced the last two lines with the following: 
     228 
     229{{{ 
     230             
     231settimer(0.1, write_data, ()) 
     232settimer(0.1, use_cpu, ()) 
     233sleep(0.2) 
     234 
     235}}} 
     236 
     237This has the affect of running each method in a separate thread, and to make the MainThread spend most of it's short life sleeping. 
     238I've also reduced the CPU allocation to 1%. Now, lets look at the output this produces: 
     239 
     240{{{ 
     241$ python repy.py restrictions.test dylink.repy prgm_stat.repy test_basic.repy  
     242 
     243  
     244--- Island Summary --- 
     245 
     246Total threads: 3 
     247Live threads: 0 
     248Total files: 1 
     249Open files: 0 
     250Total sockets: 0 
     251Open sockets: 0 
     252 
     253--- Global Sum --- 
     254 
     255Total 'fileread': 0 
     256Total 'filesopened': 1 
     257Total 'tcpsend': 0 
     258Total 'random': 0 
     259Total 'insockets': 0 
     260Total 'filewrite': 200000 
     261Total 'events': 2 
     262Total 'udpsend': 0 
     263Total 'udprecv': 0 
     264Total 'tcprecv': 0 
     265Total 'outsockets': 0 
     266Total 'cpu': 0.103658 
     267 
     268--- Island Timeline --- 
     269 
     270TOC, Thread, API, Amount 
     2718.59868597984 MainThread island_create_thread  
     2728.59897899628 MainThread start call settimer 
     2738.59966897964 MainThread settimer 1 
     2748.59981298447 MainThread start call settimer 
     2758.60012602806 MainThread settimer 1 
     2768.60028600693 MainThread start call sleep 
     2778.67762613297 ALL Stopped 0.294054779492 
     2788.9728770256 MainThread sleep 0.2 
     2798.97305107117 _EVENT:Thread:3 island_create_thread  
     2808.97326397896 _EVENT:Thread:2 island_create_thread  
     2818.97368502617 MainThread island_destroy_thread  
     2828.9745631218 _EVENT:Thread:2 open 1 
     2838.97507095337 _EVENT:Thread:2 start call file.write 
     2849.07225203514 ALL Stopped 10.0952581368 
     28519.168954134 _EVENT:Thread:3 island_destroy_thread  
     28619.1693820953 _EVENT:Thread:2 file.write 200000 
     28719.170222044 _EVENT:Thread:2 island_destroy_thread  
     288 
     289 
     290--- Thread Time Consumption --- 
     291 
     292Thread: 
     293  Name: MainThread 
     294  Total 'live': 0.374999046326 (100.0%) 
     295  Total 'sleep': 0.2 (53.33%) 
     296  Total 'stopped': 0.172591018677 (46.02%) 
     297  Total 'cpu': 0.00240802764893 (0.64%) 
     298 
     299Thread: 
     300  Name: _EVENT:Thread:3 
     301  Total 'live': 10.1959030628 (100.0%) 
     302  Total 'stopped': 10.0952581368 (99.01%) 
     303  Total 'cpu': 0.100644926038 (0.99%) 
     304 
     305Thread: 
     306  Name: _EVENT:Thread:2 
     307  Total 'filewrite': 1.0 (9.81%) 
     308  Total 'live': 10.196958065 (100.0%) 
     309  Total 'stopped': 9.19431114197 (90.17%) 
     310  Total 'cpu': 0.00264692306519 (0.03%) 
     311 
     312All Threads (Global) 
     313  Total 'filewrite': 1.0 (4.82%) 
     314  Total 'live': 20.7678601742 (100.0%) 
     315  Total 'sleep': 0.2 (0.96%) 
     316  Total 'stopped': 19.4621602974 (93.71%) 
     317  Total 'cpu': 0.105699876752 (0.51%) 
     318 
     319}}} 
     320 
     321Now the information at the top has changed slightly. We now report that the Island had a total 
     322of 3 threads, and that 2 events were used, which is correct. 
     323 
     324The timeline is slightly more complicated now, since we see that several threads are inter-mingled. 
     325However, we see that the MainThread first calls settimer twice, and then goes to sleep. 
     326 
     327At that point, repy is stopped for 0.29 seconds. Following that, the two other threads are created, 
     328and the MainThread wakes up from its sleep and exits. Thread "_EVENT:Thread:2" opens a file and beings 
     329to write, when repy is stopped suddenly for 10 seconds, due to thread "_EVENT:Thread" which is just wasting CPU. 
     330Afterward, the two threads finish up and exit. 
     331 
     332Now we can look at the thread time usage information, which is broken down for each thread, 
     333and then given globally. The MainThread is very simple to analyze. All that thread did was 
     334call settimer twice, sleep and then exit. So as expected, we see that 0.2 seconds were spent sleeping, 
     335and essentially the remaining portion of the thread's life was stopped. This seems accurate since the thread 
     336was stopped during the sleep, which extended the amount of time in the API call. 
     337 
     338Then we have the _EVENT:Thread:2 thread which was writing to disk. From before, we expected this file.write 
     339call to take 1 second, but were we were stopped for 10 seconds during the API call. However, the time usage 
     340information accurately reflects that 1 second was spent in "filewrite" and that the remaining 90% of the thread's 
     341life was spent in a stopped state. 
     342 
     343The _EVENT:Thread:3 thread spent its time just wasting CPU. Accordingly, the amount of CPU it was able to 
     344use is that of our CPU restriction (1%). The remaining time, it was stopped to compensate for overuse. 
     345 
     346The last entry is the Global sum, and it just sum's up the information to provide a higher level overview. 
     347What is clear, is that at 93% of the time being "stopped", this program would greatly benefit from having more  
     348CPU allocated to it. 
     349 
     350 
     351 
     352 
     353 
     354 
     355 
     356 
     357 
     358 
     359 
     360