FACTS & FIGURES

JCR Impact Factor: 0.595
JCR 5-Year IF: 0.661
Issues per year: 4
Current issue: May 2018
Next issue: Aug 2018
Avg review time: 106 days


PUBLISHER

Stefan cel Mare
University of Suceava
Faculty of Electrical Engineering and
Computer Science
13, Universitatii Street
Suceava - 720229
ROMANIA

Print ISSN: 1582-7445
Online ISSN: 1844-7600
WorldCat: 643243560
doi: 10.4316/AECE



  1/2018 - 4

An Automatic Instruction-Level Parallelization of Machine Code

MARINKOVIC, V., POPOVIC, M., DJUKIC, M.
 


Author keywords
parallel architectures, parallel programming, multicore processing, assembly, processor scheduling

References keywords
parallel(13), code(10), parallelization(9), automatic(8), systems(7), popovic(6), washington(4), programming(4), program(4), micro(4)

About this article
Date of Publication: 2018-02-28
Volume 18, Issue 1, Year 2018, On page(s): 27 - 36
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2018.01004
SCOPUS ID: 85043242372

Abstract
The prevalence of multicore processors and the arrival of manycore processors have created a major present-day challenge: parallelizing embedded software that is still written sequentially. This paper considers automatic code parallelization, focusing on the development of a parallelization tool that works at the binary level and on the validation of this approach. A novel instruction-level parallelization algorithm for assembly code is developed. It uses the register names obtained after conversion to static single assignment (SSA) form to find independent blocks of code, and then schedules those blocks using METIS to achieve good load balance. Sequential consistency is verified, and the approach is validated by measuring program execution time on the target architecture. With speedup taken as the performance measure, substantial speedup and optimal load balancing are achieved for multicore RISC processors with 2 to 16 cores (e.g. MIPS, MicroBlaze). In particular, for 16 cores the average speedup is 7.92x, and in some cases it reaches 14x. The approach to automatic parallelization presented in this paper is useful to researchers and developers in the area of parallelization as a basis for further optimizations, as the back end of a compiler, or as a code parallelization tool for an embedded system.
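The two steps named in the abstract (finding independent blocks from register names, then balancing them across cores) can be illustrated with a small sketch. This is not the authors' tool: the instruction format, the function names `independent_blocks` and `balance`, and the register names are hypothetical, memory dependences are ignored, and a simple greedy longest-block-first heuristic stands in for METIS graph partitioning.

```python
from collections import defaultdict


def independent_blocks(instrs):
    """Group instructions into blocks that share no register names.

    Each instruction is a hypothetical (dest, sources) pair whose register
    names are assumed to be already in SSA form. Memory dependences are
    ignored in this simplified sketch.
    """
    parent = list(range(len(instrs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_use = {}
    for idx, (dest, srcs) in enumerate(instrs):
        for reg in (dest, *srcs):
            if reg in first_use:
                # Two instructions touch the same register: same block.
                union(first_use[reg], idx)
            else:
                first_use[reg] = idx

    blocks = defaultdict(list)
    for idx in range(len(instrs)):
        blocks[find(idx)].append(idx)
    return list(blocks.values())


def balance(blocks, cores):
    """Assign blocks to cores greedily, largest block first.

    Stand-in for METIS: block cost is approximated by instruction count.
    """
    loads = [0] * cores
    plan = [[] for _ in range(cores)]
    for block in sorted(blocks, key=len, reverse=True):
        target = loads.index(min(loads))
        plan[target].append(block)
        loads[target] += len(block)
    return plan, loads


# Hypothetical six-instruction program in SSA form: (dest, sources).
program = [
    ("r1_0", ()),
    ("r2_0", ("r1_0",)),
    ("r3_0", ()),
    ("r4_0", ("r3_0",)),
    ("r5_0", ("r2_0",)),
    ("r6_0", ()),
]
blocks = independent_blocks(program)
plan, loads = balance(blocks, 2)
print(blocks)  # [[0, 1, 4], [2, 3], [5]]
print(loads)   # [3, 3]
```

Because SSA gives every definition a unique name, two instructions that share a register name are linked by a true data dependence, which is why name sharing alone is enough to separate the blocks in this toy setting.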


References

[1] L. Hochstein, J. Carver, F. Shull, S. Asgari, V. Basili, "Parallel programmer productivity: A case study of novice parallel programmers," Proceedings of the 2005 ACM/IEEE conference on Supercomputing (SC '05), Washington, pp. 35-43, 2005.
[CrossRef] [SCOPUS Times Cited 49]


[2] M. Popovic, M. Djukic, V. Marinkovic, N. Vranic, "On task tree executor architectures based on Intel parallel building blocks," Computer Science and Information Systems, vol. 10, no. 1, pp. 369-392, 2013.
[CrossRef] [Web of Science Times Cited 1] [SCOPUS Times Cited 2]


[3] R. Chandra, R. Menon, L. Dagum, D. Kohr, D. Maydan, J. McDonald, "Parallel programming in OpenMP," pp. 157-159, Academic Press, 2001, ISBN: 1558606718.

[4] D.B. Kirk, W.W. Hwu, "Programming massively parallel processors," pp. 68-70, Morgan Kaufmann Publishers, 2010, ISBN: 0124159923.

[5] A. Bhattacharjee, G. Contreras, M. Martonosi, "Parallelization Libraries: Characterizing and Reducing Overheads," ACM Trans. Archit. Code Optim., vol. 8, no. 1, pp. 5:1-5:29, 2011.
[CrossRef] [Web of Science Times Cited 11] [SCOPUS Times Cited 18]


[6] A. Kotha, K. Anand, M. Smithson, G. Yellareddy, R. Barua, "Automatic Parallelization in a Binary Rewriter," Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 43), Washington, pp. 547-557, 2010.
[CrossRef] [SCOPUS Times Cited 24]


[7] G. Karypis, V. Kumar, "A fast and high quality multilevel scheme for partitioning irregular graphs," SIAM Journal on Scientific Computing, vol. 20, no. 1, pp. 359-392, 1998.
[CrossRef] [Web of Science Times Cited 1657] [SCOPUS Times Cited 2329]


[8] X. Wang, S. Thota, "A resource-efficient communication architecture for chip multiprocessors on FPGAs," J. Comput. Sci. Technol., vol. 26, no. 3, pp. 434-447, 2011.
[CrossRef] [Web of Science Times Cited 3] [SCOPUS Times Cited 4]


[9] U. Vishkin, "Is multicore hardware for general-purpose parallel processing Broken?," Communications of the ACM, vol. 57, no. 4, pp. 35-39, 2014.
[CrossRef] [Web of Science Times Cited 5] [SCOPUS Times Cited 5]


[10] M. Djukic, M. Popovic, N. Cetic, I. Povazan, "Embedded Processor Oriented Compiler Infrastructure," Advances in Electrical and Computer Engineering, vol. 14, no. 3, pp. 123-130, 2014.
[CrossRef] [Full Text] [Web of Science Times Cited 1] [SCOPUS Times Cited 1]


[11] N. Vranic, V. Marinkovic, M. Djukic, M. Popovic, "An approach to parallelization of sequential C code," 2011 Second Eastern European Regional Conference on the Engineering of Computer Based Systems, Bratislava, pp. 143-146, 2011.
[CrossRef] [Web of Science Times Cited 1] [SCOPUS Times Cited 3]


[12] D. Kovacevic, M. Stanojevic, V. Marinkovic, M. Popovic, "A solution for automatic parallelization of sequential assembly code," Serbian Journal of Electrical Engineering, vol. 10, no. 1, pp. 91-101, 2013.
[CrossRef]


[13] K. Kyriakopoulos, K. Psarris, "Non-linear symbolic analysis for advanced program parallelization," IEEE Transactions on Parallel and Distributed Systems, vol. 20, no. 5, pp. 623-640, 2009.
[CrossRef] [Web of Science Times Cited 3] [SCOPUS Times Cited 4]


[14] G. Ottoni, R. Rangan, A. Stoler, D. I. August, "Automatic thread extraction with decoupled software pipelining," Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 38), Washington, pp. 105-118, 2005.
[CrossRef] [SCOPUS Times Cited 152]


[15] S. Campanoni, T. Jones, G. Holloway, V. J. Reddi, G. Y. Wei, D. M. Brooks, "HELIX: Automatic Parallelization of Irregular Programs for Chip Multiprocessing," Proceedings of the Tenth International Symposium on Code Generation and Optimization (CGO '12), San Jose, pp. 84-93, 2012.
[CrossRef] [SCOPUS Times Cited 38]


[16] C. Dave, H. Bae, S. Min, S. Lee, R. Eigenmann, S. Midkiff, "Cetus: A source-to-source compiler infrastructure for multicores," Computer, vol. 42, no. 12, pp. 36-42, 2009.
[CrossRef] [Web of Science Times Cited 47] [SCOPUS Times Cited 86]


[17] M. Mathews, J. P. Abraham, "Automatic Code Parallelization with OpenMP task constructs," Proceedings of the 2016 International Conference on Information Science (ICIS '16), Kochi, pp. 233-238, 2016.
[CrossRef] [SCOPUS Times Cited 1]


[18] E. Yardimci, M. Franz, "Dynamic parallelization and mapping of binary executables on hierarchical platforms," Proceedings of the 3rd Conference on Computing Frontiers (CF '06), Ischia, pp. 127-138, 2006.
[CrossRef] [SCOPUS Times Cited 14]


[19] W. Liu, J. Tuck, L. Ceze, W. Ahn, K. Strauss, J. Renau, J. Torrellas, "POSH: a TLS compiler that exploits program structure," Proceedings of the Eleventh ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP '06), New York, pp. 158-167, 2006.
[CrossRef]


[20] H. Kim, N. P. Johnson, J. W. Lee, S. A. Mahlke, D. I. August, "Automatic speculative DOALL for clusters," Proceedings of the Tenth International Symposium on Code Generation and Optimization (CGO '12), San Jose, pp. 94-103, 2012.
[CrossRef] [SCOPUS Times Cited 20]


[21] T. Oh, S. R. Beard, N. P. Johnson, S. Popovych, D. I. August, "A Generalized Framework for Automatic Scripting Language Parallelization," Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques (PACT ’17), Portland, pp. 356-369, 2017.
[CrossRef] [Web of Science Times Cited 1] [SCOPUS Times Cited 2]


[22] C. Wang, X. Li, J. Zhang, X. Zhou, X. Nie, "MP-Tomasulo: A Dependency-Aware Automatic Parallel Execution Engine for Sequential Programs," ACM Trans. Archit. Code Optim., vol. 10, no. 2, pp. 9:1-9:26, 2013.
[CrossRef] [Web of Science Times Cited 15] [SCOPUS Times Cited 18]


[23] Y. Dou, J. Zhou, G.-M. Wu, J.-F. Jiang, Y.-W. Lei, S.-C. Ni, "A unified co-processor architecture for matrix decomposition," J. Comput. Sci. Technol., vol. 25, no. 4, pp. 874-885, 2010.
[CrossRef] [Web of Science Times Cited 3] [SCOPUS Times Cited 5]


[24] M. Dali, A. Guessoum, R. M. Gibson, A. Amira, N. Ramzan, "Efficient FPGA Implementation of High-Throughput Mixed Radix Multipath Delay Commutator FFT Processor for MIMO-OFDM," Advances in Electrical and Computer Engineering, vol. 17, no. 1, pp. 27-38, 2017.
[CrossRef] [Full Text] [Web of Science Times Cited 1] [SCOPUS Times Cited 2]


[25] D. Capko, A. Erdeljan, G. Svenda, M. Popovic, "Dynamic repartitioning of large data model in distribution management systems," Electronics and Electrical Engineering, vol. 120, no. 4, pp. 83-88, 2012.
[CrossRef] [Web of Science Times Cited 1]


[26] D. Capko, A. Erdeljan, M. Popovic, G. Svenda, "An optimal initial partitioning of large data model in utility management systems," Advances in Electrical and Computer Engineering, vol. 11, no. 4, pp. 41-46, 2011.
[CrossRef] [Full Text] [Web of Science Times Cited 8] [SCOPUS Times Cited 8]


[27] A. H. Hormati, Y. Choi, M. Kudlur, R. Rabbah, T. Mudge, S. Mahlke, "Flextream: Adaptive compilation of streaming applications for heterogeneous architectures," The 18th Int. Conf. on Parallel Arch. and Compilation Techn., Washington, pp. 214-223, 2009.
[CrossRef] [Web of Science Times Cited 22] [SCOPUS Times Cited 58]


[28] A. V. Aho, M. S. Lam, R. Sethi, J. D. Ullman, "Compilers: principles, techniques, & tools," pp. 369-370, Addison-Wesley, 2007, ISBN: 0321486811.

[29] A.J. Bernstein, "Analysis of programs for parallel processing," IEEE Transactions on Electronic Computers, vol. EC-15, no. 5, pp. 757-763, 1966.
[CrossRef] [SCOPUS Times Cited 185]


[30] S. Debray, R. Muth, M. Weippert, "Alias analysis of executable code," Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '98), San Diego, pp. 12-24, 1998.
[CrossRef]


[31] W. Amme, P. Braun, F. Thomasset, E. Zehendner, "Data dependence analysis of assembly code," Int. J. Parallel Program., vol. 28, no. 5, pp. 431-467, 2000.
[CrossRef] [Web of Science Times Cited 9] [SCOPUS Times Cited 21]


[32] C. Wimmer, M. Franz, "Linear scan register allocation on SSA Form," Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO '10), Toronto, pp. 170-179, 2010.
[CrossRef] [SCOPUS Times Cited 19]


[33] M. Poletto, V. Sarkar, "Linear Scan Register Allocation," ACM Trans. Program. Lang. Syst., vol. 21, no. 5, pp. 895-913, 1999.
[CrossRef] [Web of Science Times Cited 125] [SCOPUS Times Cited 184]


[34] G. Matheou, P. Evripidou, "Verilog-based simulation of hardware support for data-flow concurrency on multicore systems," Proceedings of the 2013 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, Samos, pp. 280-287, 2013.
[CrossRef] [SCOPUS Times Cited 4]




References Weight

Web of Science® Citations for all references: 1,914 TCR
SCOPUS® Citations for all references: 3,256 TCR

Web of Science® Average Citations per reference: 55 ACR
SCOPUS® Average Citations per reference: 93 ACR

TCR = Total Citations for References / ACR = Average Citations per Reference
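As a quick check of the TCR figures, the per-reference "Times Cited" counts shown in the reference list can be summed directly. The sketch below is only an illustration: the dictionary names are hypothetical, and the keys are the reference numbers that carry a count for each database. The page does not state which divisor it uses for ACR, so that value is not reproduced here.

```python
# Times Cited counts copied from the reference list, keyed by reference number.
wos_counts = {
    2: 1, 5: 11, 7: 1657, 8: 3, 9: 5, 10: 1, 11: 1, 13: 3, 16: 47,
    21: 1, 22: 15, 23: 3, 24: 1, 25: 1, 26: 8, 27: 22, 31: 9, 33: 125,
}
scopus_counts = {
    1: 49, 2: 2, 5: 18, 6: 24, 7: 2329, 8: 4, 9: 5, 10: 1, 11: 3,
    13: 4, 14: 152, 15: 38, 16: 86, 17: 1, 18: 14, 20: 20, 21: 2,
    22: 18, 23: 5, 24: 2, 26: 8, 27: 58, 29: 185, 31: 21, 32: 19,
    33: 184, 34: 4,
}

# TCR is the plain sum of the counts for one database.
tcr_wos = sum(wos_counts.values())
tcr_scopus = sum(scopus_counts.values())

print(tcr_wos)     # 1914
print(tcr_scopus)  # 3256
```

Both sums agree with the totals reported above, and reference [7] alone accounts for most of each total.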

In 2010 we introduced, for the first time in scientific publishing, the term "References Weight" as a quantitative indication of the quality ...

Citations for references updated on 2018-06-15 19:38 in 217 seconds.




Note1: Web of Science® is a registered trademark of Clarivate Analytics.
Note2: SCOPUS® is a registered trademark of Elsevier B.V.
Disclaimer: All queries to the respective databases were made by using the DOI record of every reference (where available). Due to technical problems beyond our control, the information is not always accurate. Please use the CrossRef link to visit the respective publisher site.

Copyright ©2001-2018
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania


All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.

Permission for other use: The copyright owner's consent does not extend to copying for general distribution, for promotion, for creating new works, or for resale. Specific written permission must be obtained from the Editor for such copying. Direct linking to files hosted on this website is strictly prohibited.

Disclaimer: Whilst every effort is made by the publishers and editorial board to see that no inaccurate or misleading data, opinions or statements appear in this journal, they wish to make it clear that all information and opinions formulated in the articles, as well as linguistic accuracy, are the sole responsibility of the author.



