ABSTRACT
Performance analysis is the task of monitoring the behaviour of a program during execution. Its main goal is to identify adjustments that could improve the performance of the computer system in use. To achieve that improvement, it is necessary to find the different causes of, and contributors to, overhead. We are already in the multicore era, but there is a gap between the level of development of the two main divisions of multicore technology: hardware and software.
This project focuses on performance analysis, the tuning of applications running specifically on shared-memory systems, and the development of an application that automatically extracts system characteristics and configurations. The application was designed using OODM and implemented in the C# programming language, and it can be used on any Windows operating system. It critically analyses a multicore system, determines various causes of overhead in a multicore environment, extracts system parameters and presents various optimization strategies.
CHAPTER ONE
INTRODUCTION
1.1 Introduction
With computers playing an increasingly critical role in our day-to-day lives, it is important to know their components, how each works, and what impact each has on the performance of the computer system.
According to Arnold (1994), computer performance is characterised by the amount of useful work accomplished by a computer system compared to the time and resources used. Depending on the context, good computer performance depends on the available system resources. Most computer users do not know their system's specification and lack knowledge of the conventional ways of extracting system parameters. With the computerised system developed in this thesis (otherwise known as Autospec), every computer user will be able to determine the system configuration by installing and running the software.
System development can be likened to building a house: it demands adequate planning and preparation in order to meet the objectives of the proposed design.
The parameters or resources that are of interest in our analysis include the following:
Summary
Operating system
CPU
RAM
Hard drives
Optical drives
Motherboard
Graphics
Network
Audio
Peripheral
Performance
Performance analysis is the task of investigating the behaviour of program execution (Mario, 2009). The main aim is to identify adjustments that could enhance the performance of the computer system. In addition, the hardware architecture and the software platform (operating system) on which a program is executed have an impact on its performance. Workload characterization involves studying the user and machine environment, observing key characteristics, and developing a workload model that can be used repeatedly. Once a workload model is available, the effect of changes in the workload and system can be easily evaluated by changing the parameters of the model. This can be achieved by using compiler directives such as those provided by OpenMP for multithreaded applications. In addition, workload characterization can help to determine what is normal, prepare a baseline for historical comparison, comply with management reporting, and identify candidates for optimization.
Presently, multicore processor chips are being introduced in almost all areas where a computer is needed. For example, many laptop computers have a dual-core processor inside. High Performance Computing (HPC) addresses different issues, one of which is the exploitation of the capacities of multicore architectures (Mario, 2009).
Performance analysis and optimization is a field of HPC responsible for analysing the behaviour of applications that perform large amounts of computation. Such applications often require analysis and tuning. Therefore, in order to achieve better performance, it is necessary to find the different causes of overhead.
There is a considerable number of studies related to the performance analysis and tuning of applications for supercomputing, but relatively few studies specifically address applications running in a multicore environment.
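As a rough illustration of what quantifying overhead can look like, the Python sketch below compares a serial run time with a parallel run time and derives speedup, efficiency and total overhead. The function names and the sample timings are hypothetical and are not taken from this project; the formulas are the usual textbook definitions (speedup = serial time / parallel time, efficiency = speedup / cores, overhead = cores x parallel time - serial time).

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial run time to parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Speedup per core; 1.0 means the cores are perfectly utilised."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(t_serial, t_parallel) / cores


def total_overhead(t_serial, t_parallel, cores):
    """Core-time spent beyond the serial work: cores * T_parallel - T_serial."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * t_parallel - t_serial


if __name__ == "__main__":
    # Hypothetical measurements (seconds) for one program on a 4-core machine.
    t_s, t_p, n = 8.0, 2.5, 4
    print("speedup   :", speedup(t_s, t_p))
    print("efficiency:", efficiency(t_s, t_p, n))
    print("overhead  :", total_overhead(t_s, t_p, n), "core-seconds")
```

A non-zero overhead value only says that time is being lost; attributing it to a specific cause (synchronisation, load imbalance, memory contention and so on) still requires profiling the application.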
A multicore system is composed of two or more independent cores (or CPUs). The cores are typically integrated onto a single circuit die (known as a chip multiprocessor or CMP), or they may be integrated onto multiple dies in a single chip package.
This thesis examines the issues involved in the performance analysis and tuning of applications running specifically on shared-memory systems, and the development of a computerised system for retrieving system specifications to inform possible changes. Multicore hardware is relatively more mature than multicore software, and the necessity of this research arises from that reality. We would like to emphasize that this is an active area of research; there are only some early results in the academic and industrial worlds in terms of established standards and technology, but much more will evolve in the years to come.
For several years, computer technology has been going through a phase of many developments. Following the trend commonly associated with Moore's law, processor speed increased very quickly. Each new generation of microprocessor came with a clock rate that was usually twice as fast as, or even faster than, that of the previous one. That increase in clock frequency drove increases in processor performance, but at the same time the difference between processor speed and memory speed kept growing. This gap was temporarily addressed by instruction-level parallelism (ILP) (Faxen et al., 2008). Exploiting ILP means executing in parallel instructions that occur close to each other in the stream of instructions passing through the processor. However, it soon became apparent that more and more cycles were being spent not in processor core execution but in the memory subsystem, which includes the multilevel caching structure. The so-called Memory Wall problem began to grow quite significantly because the increase in memory speed did not match that of processor cores.
Very soon, a new direction for increasing the overall performance of computer systems was proposed: changing the structure of the processor subsystem to use several processor cores on a single chip. These new computer architectures were named Chip Multiprocessors (CMP) and provided increased performance for a new generation of systems while keeping the clock rate of individual processor cores at a reasonable level. As a result of this architectural change, it became possible to provide further improvements in performance while keeping the power consumption of the processor subsystem almost constant. This trend is essential not only to power-sensitive market segments such as embedded systems, but also to server farms, which suffer from power consumption and dissipation problems as well.
Some of the advantages that shared memory CMP may offer are:
• Direct access to data through shared memory address space.
• A greater latency-hiding mechanism.
• CMPs appear to lower power and cooling requirements per FLOP.
There are two main types of CMP:
There are those that contain a few very powerful cores, each essentially the same core one would put in a single-core processor. Examples include AMD Athlons, Intel Core 2, IBM Power 6 and so on.
There are those systems that trade single-core performance for number of cores, limiting core area and power. Examples include the Tilera 64, the Intel Larrabee and the Sun UltraSPARC T1 and T2 (also known as Niagara 1-2). Figure 1.1 shows the basic structure of a dual-core processor.
Fig. 1.1. An example of a dual core MCP (Multicore Processor) structure.
1.2 Statement of Problem
Multicore hardware technology is advancing very fast, and there is a gap between the level of development of multicore hardware and that of multicore software. Multithreaded applications may not always take proper advantage of the multicore hardware architecture, especially when the user of such a system lacks in-depth knowledge of the system parameters.
Most people can name the processor, and perhaps how much RAM the PC has and how big the hard drive is, from the stickers or tags on the PC. A user can also get this basic information by right-clicking My Computer and then clicking Properties. However, these sources miss a lot of information that users need to know about their system, such as the graphics card, motherboard type, cache type and the temperature of some hardware components.
1.3 Objective of the Study
The main objectives of this work are:
To develop an application that helps any computer user to automatically extract system parameters.
To determine the main causes of overhead in the multicore environment.
To apply the necessary measures/changes to make better use of the hardware resources and therefore obtain better performance.
1.4 Significance of Study
Autospec is software that acts as a sticker for your PC. It analyses and shows statistics on every piece of hardware in your computer, including the CPU, motherboard, RAM, graphics card, operating system, hard disks, optical drives and audio support. Additionally, Autospec reports the temperatures of your different components, so you can easily see if there is a problem, and it provides an interface for checking for software updates.
It helps any PC user, especially novices, in everyday computing to troubleshoot, diagnose problems, make replacements, control processes and implement some settings.
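To give a feel for what automatic extraction of system parameters involves, the following is a minimal sketch in Python rather than the C# used for Autospec. It is not the project's implementation: the function name `collect_system_parameters` and the dictionary keys are hypothetical, and it reports only what the Python standard library exposes. Details such as component temperatures, graphics card or motherboard model would need platform-specific interfaces that are not shown here.

```python
import os
import platform
import shutil


def collect_system_parameters(path=None):
    """Return a small dictionary of system parameters (illustrative only)."""
    if path is None:
        # Root of the current drive: "/" on Unix-like systems, e.g. "C:\\" on Windows.
        path = os.path.abspath(os.sep)
    total, used, free = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "operating_system": platform.system(),   # e.g. "Windows" or "Linux"
        "os_release": platform.release(),
        "architecture": platform.machine(),
        "processor": platform.processor(),       # may be an empty string
        "logical_cores": os.cpu_count(),         # may be None if undetermined
        "disk_total_gb": round(total / gib, 1),
        "disk_free_gb": round(free / gib, 1),
    }


if __name__ == "__main__":
    for name, value in collect_system_parameters().items():
        print(f"{name:>18}: {value}")
```

Returning a plain dictionary keeps the collection step separate from presentation, so the same data could feed a report, a user interface or a log file.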
1.5 Scope of Study
Multicore technology may be divided into two main categories: hardware and software. The application focuses on the hardware. The developed software will be able to extract the system specification of any multicore computer system running the Windows operating system.
1.6 Definition of Terms
MULTICORE: Multicore refers to an architecture in which a single physical processor incorporates the core logic of more than one processor. A single integrated circuit is used to package or hold these processors. This single integrated circuit is known as a die.
MULTITHREADING: is a type of execution model that allows multiple threads to exist within the context of a process such that they execute independently but share their process resources. A thread maintains a list of information relevant to its execution, including its scheduling priority, exception handlers, a set of CPU registers, and stack state in the address space of its hosting process.
TUNING: is the improvement of system performance by varying parameters such as the processor, secondary storage devices and main memory, and by modifying the application.
APPLICATION PATTERN: This is a design pattern which covers useful architectural and code design patterns that can lead to the creation of more maintainable and evolvable software.
OpenMP (Open Multi-Processing): is an API that supports multi-platform shared memory multiprocessing programming in C, C++, and Fortran, on most processor architectures and operating systems, including Solaris, AIX, HP-UX, GNU/Linux, Mac OS X, and Windows.
COMPUTER PERFORMANCE: is characterized by the amount of useful work accomplished by a computer system compared to the time and resources used.
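The MULTITHREADING definition above can be made concrete with a short Python sketch in which several threads run independently but update one counter that belongs to their shared process. The function name `count_with_threads` and its parameters are hypothetical and chosen only for illustration; the lock is an assumption of this sketch, added so that concurrent updates to the shared value are not lost.

```python
import threading


def count_with_threads(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads and return its value."""
    counter = {"value": 0}      # lives in process memory, visible to every thread
    lock = threading.Lock()     # serialises updates to the shared counter

    def worker():
        # Each thread has its own stack and loop variable,
        # but all threads write to the same counter object.
        for _ in range(increments):
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                # wait for every thread to finish
    return counter["value"]


if __name__ == "__main__":
    print(count_with_threads())
```

The time threads spend waiting for the lock is itself a form of synchronisation overhead, which is one of the costs that performance analysis of multithreaded applications tries to expose.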