Technical Reports
The Department of Electrical and Computer Engineering publishes a Technical Report series. Technical reports are intended primarily for the long-term archival of results and descriptions that are not suitable for publication elsewhere because of their length or nature. Technical reports are also the most common way to make ECE Master's and Ph.D. theses widely available. All technical reports are available online in PDF.
How to Format
ECE recommends the following guidelines for formatting a technical report:
- letter paper (8.5in × 11in)
- minimum margins of 1in on all sides
- font size of at least 11 points (Times font)
- abstract of 300-500 words containing only letters, numbers, and punctuation, with no special or math symbols
- comprehensive reference list ordered by the last name of the first author
- figures and tables at the top or bottom of a page, without breaking the flow of sentences
- page number in the upper right corner
- LaTeX users may use the IEEEtran class, which is publicly available on CTAN (a request to install IEEEtran on the ECE Unix systems has already been made), with the following class options and setup:
  - \documentclass[11pt,journal,onecolumn,oneside]{IEEEtran}
  - \setlength{\oddsidemargin}{0in}
  - \setlength{\evensidemargin}{0in}
  - \setlength{\topmargin}{0in}
  - \addtolength{\topmargin}{-\headsep}
  - \addtolength{\topmargin}{-\headheight}
  - \setlength{\textheight}{9in}
  - \setlength{\textwidth}{6.5in}
- MS Word users can download a recommended template, which is the IEEE template with adjusted margins
How to Submit
To submit a technical report, send your PDF file to Jane Simpson.
Archive
- TR-ECE-09-01: Simulation Analysis about Higher Performance of GDCF Over DCF in IEEE 802.11 MAC Layer Protocol,
Feng An and Dongsoo Kim
This paper introduces an enhanced version of DCF, called gentle DCF (GDCF). The ordinary IEEE 802.11 DCF resets the contention window to its initial value after each successful transmission, which essentially assumes that each successful transmission indicates that the system is under a low traffic load. GDCF takes a more conservative measure by halving the contention window size after c consecutive successful transmissions. This gentle decrease can reduce the collision probability. The paper computes the optimal value of c, and the numerical results from both analysis and simulation demonstrate that GDCF significantly improves the performance of 802.11 DCF.
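The difference between the two window-update rules can be sketched in a few lines of Python. This is a minimal illustration, not code from the report: the class names, the CW_MIN and CW_MAX values, and the choice to reset the success counter after each halving are assumptions made for the example.

```python
CW_MIN = 32     # assumed initial contention window
CW_MAX = 1024   # assumed maximum contention window


class DcfWindow:
    """Ordinary DCF: double on collision, reset to the minimum on success."""

    def __init__(self):
        self.cw = CW_MIN

    def on_collision(self):
        self.cw = min(2 * self.cw, CW_MAX)

    def on_success(self):
        self.cw = CW_MIN


class GdcfWindow:
    """Gentle DCF: halve only after c successes in a row (hypothetical sketch)."""

    def __init__(self, c):
        self.cw = CW_MIN
        self.c = c
        self.streak = 0  # successes since the last collision or halving

    def on_collision(self):
        self.cw = min(2 * self.cw, CW_MAX)
        self.streak = 0

    def on_success(self):
        self.streak += 1
        if self.streak == self.c:
            self.cw = max(self.cw // 2, CW_MIN)
            self.streak = 0  # assumption: start counting again after halving
```

After three collisions both windows grow from 32 to 256. A single success then returns DcfWindow to 32, whereas GdcfWindow(c=4) stays at 256 until the fourth success in a row and only then drops to 128.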
- TR-ECE-08-07: Efficient Approximation of Exponentially Weighted Moving Average,
Gennady Vayl and Dongsoo S. Kim
This paper presents a computationally efficient approach to integer approximation of an exponentially weighted moving average. Unlike the conventional approach, which uses floating-point operations, this approach uses integer approximation with a remainder-based rounding scheme. For applications that lack a hardware floating-point unit, the proposed approach is proven to execute faster and to be even more accurate than floating-point emulation methods.
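One way to picture why carrying a remainder matters is the sketch below. It is not necessarily the scheme used in the report: it assumes a smoothing factor of 1/2^k so that the division becomes a shift, and the function names are hypothetical. The first function simply drops the fraction lost in each integer division; the second feeds that lost fraction back into the next update.

```python
def ewma_truncating(samples, k, start=0):
    """Integer EWMA with alpha = 1/2**k that discards the division remainder."""
    y = start
    for x in samples:
        y += (x - y) >> k          # floor((x - y) / 2**k)
    return y


def ewma_with_remainder(samples, k, start=0):
    """Integer EWMA with alpha = 1/2**k that carries the remainder forward."""
    y, rem = start, 0
    for x in samples:
        delta = x - y + rem        # add back what earlier divisions lost
        step = delta >> k          # floor(delta / 2**k)
        rem = delta - (step << k)  # always in the range 0 .. 2**k - 1
        y += step
    return y
```

With k = 3, a start value of 0, and a constant input of 100, the truncating version stalls at 93 because the remaining difference of 7 shifts down to zero, while the remainder-carrying version reaches 100.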
- TR-ECE-08-06: Analysis of Effective Connectivity in Mobile Wireless Communications,
J. David Haughs and Dongsoo S. Kim
This research demonstrates the effect of a deployed region's boundaries on the effective coverage of a mobile node. A node's coverage area is not uniform throughout the entire deployed region. Assuming a uniform coverage can result in significant error in calculations. In this study, we analyze the behavior of a node's coverage area as a function of its transmission range throughout the entire deployed region. Using this analysis, a mathematical model for effective coverage in mobile wireless communications is created. The mathematical model considers the effect of the deployed region's boundaries on the coverage area of a mobile node. Lastly, we present simulation results to verify the analytical model and to compare this model with that of a uniform coverage.
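The boundary effect described above can be illustrated numerically. The sketch below is not the report's analytical model: it assumes a square deployed region and a disk-shaped transmission range, and the function names and grid resolution are hypothetical. It estimates how much of a node's disk actually lies inside the region by sampling a grid over the disk's bounding box.

```python
import math


def covered_area(x, y, r, side, n=400):
    """Approximate the area of the disk of radius r centered at (x, y)
    that lies inside the square region [0, side] x [0, side]."""
    h = 2 * r / n                      # width of one grid cell
    count = 0
    for i in range(n):
        px = x - r + (i + 0.5) * h     # cell-center x coordinate
        if not 0 <= px <= side:
            continue
        for j in range(n):
            py = y - r + (j + 0.5) * h
            inside_region = 0 <= py <= side
            inside_disk = (px - x) ** 2 + (py - y) ** 2 <= r * r
            if inside_region and inside_disk:
                count += 1
    return count * h * h


def coverage_ratio(x, y, r, side):
    """Fraction of the full disk area that is usable inside the region."""
    return covered_area(x, y, r, side) / (math.pi * r * r)
```

For a 100 by 100 region and a range of 10, a node at the center (50, 50) has a ratio close to 1, a node on an edge such as (0, 50) has about one half, and a node in the corner (0, 0) has about one quarter, which is why assuming uniform coverage overestimates connectivity near the boundary.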
- TR-ECE-08-05: An Educational MIPS64 Simulator Designed in C#,
I. Seo, P. Vaidya and J. Lee
Even for ECE students, understanding how a computer works is typically difficult, and understanding the internal operation of a computer architecture is even more challenging. To help students grasp this topic more easily, we have developed an educational MIPS64 instruction set simulator with visualization, written in the recent programming language C#. We have chosen the MIPS64 architecture because it is the one most often taught in computer architecture and organization courses. The contributions of this work are the following. First, the simulator visualizes the contents of architectural components such as general-purpose and special registers. Second, it shows graphically the transition of code from C source statements to assembly instructions to machine binaries. Third, it allows users to step through each instruction or to execute with breakpoint support. Therefore, we believe that this simulator is very useful for many EE, ECE and CS students.
- TR-ECE-08-04: Implementation of Computer Architecture Description Language (CADL) Compiler,
C. Barnes, P. Vaidya and J. Lee
Computer architecture simulation has always played a pivotal role in the continuous innovation of computers. Thus, the Computer Architecture Lab in the ECE department of IUPUI is dedicated to developing the most promising computer architecture simulation framework for the multithreaded simulation of both uni-core and multi-core processors. Along this line of research, the CADL Compiler was developed to extend the multithreaded simulator currently in development at the ECE Department Computer Architecture Lab. The purpose of the compiler is to process the proprietary Computer Architecture Description Language (CADL) and output a fully functioning multithreaded simulator program. Built on XML, an industry-standard extensible markup language, CADL is a language developed at the ECE Department of IUPUI to describe the functionality and architecture of the processor for simulation. The CADL compiler is necessary to extend the simulator by allowing users to easily and quickly modify the structure, instruction set, and execution of the processor modeled within the multithreaded simulator.
- TR-ECE-08-03: Vectorized Database Algorithms Using A Novel Multicontext Coarse-Grained FPGA,
P. Vaidya and J. Lee
Reconfigurable Logic (RL) coprocessors may be a promising solution for hardware acceleration of databases. In this article, we propose a multi-context, coarse-grained Reconfigurable coprocessor Unit (RU) model that is used to accelerate database operations in hardware for column-oriented databases. We then describe the implementation of hardware algorithms for the equi-join, non-equi-join and inverse lookup database operations. Finally, we evaluate the hardware algorithms using a query that is similar to one of the TPC-H queries. Our results indicate that query execution on the proposed RU model is one to two orders of magnitude faster than software-only query execution. Note: This report will be removed once the same manuscript is published.
- TR-ECE-08-02: A Parallel Deadlock Detection Algorithm with O(1) Overall Run-time Complexity and Its Hardware Implementation,
X. Xiao and J. Lee
Due to rapid technological advances, Multiprocessor System-on-Chips (MPSoCs) are likely to become commodity computing platforms for embedded applications. In the future, an MPSoC may be equipped with a large number of processing elements as well as on-chip resources. The management of these processing elements and resources faces many challenges, among which deadlock is one of the most crucial issues. This article presents a novel hardware-oriented deadlock detection algorithm suitable for current and future MPSoCs. Unlike previously published methods, whose run-time complexities are often affected by the number of processing elements and resources in the system, the proposed algorithm has O(1) overall run-time complexity. Such complexity is achieved by (i) classifying resource allocation events; (ii) for each specific type of event, performing a set of specific detection and/or preparation operations that take only constant run-time; and (iii) updating necessary information for multiple resources in parallel in hardware. The proposed algorithm can be implemented as a specialized unit on MPSoCs with small area overhead. We implement the algorithm in Verilog HDL and demonstrate in simulation that each algorithm invocation takes at most four clock cycles.
- TR-ECE-08-01: Automatic Glycemic Alert System,
S. Lee and D. Kim
A large randomized controlled study from Leuven, Belgium demonstrated that normalization of blood glucose levels using an intensive insulin infusion protocol (IIP) improved clinical outcomes in patients admitted to a surgical intensive care unit (ICU) (6). In the Leuven study, intensive insulin therapy (to maintain blood glucose target levels between 80 and 110 mg/dl) reduced ICU mortality by 42%. Based on this clinical evidence, there are increasing efforts worldwide to maintain strict glycemic control in critically ill patients. These efforts have developed standardized glucose control protocols, such as Insulin Infusion Protocols (IIP), that adjust the insulin infusion rate tightly to meet the glucose target level. Such protocols are implemented for inpatients admitted to the hospital. Diabetes outpatients could maintain the same lower and stable glucose levels if they monitored their blood glucose level on a regular basis and applied a similar glucose control protocol at home. However, it is not easy for patients to implement the protocol themselves without the help of health care professionals. They cannot remember all the rules and adjust the insulin dosage whenever they check their blood glucose level. To help those under conventional insulin therapy, this article describes the Automatic Glycemic Alert System, which implements a glucose control protocol in a central database as rules and instructions. The rules in the database indicate how much insulin (or carbohydrate, in case of a hypoglycemia event) a patient needs to take for each range of glucose level. The rules are specific to each individual patient. When the system receives a patient’s blood glucose test result from a wireless glucose tester, a system trigger is automatically activated to compare the result to the rules in the database to see if the result falls into one of the rules. If it does, the trigger generates an instruction that tells the patient how much insulin or carbohydrate she should take to bring the glucose level into the target range. Then, it delivers the instruction to the patient’s mobile phone as a text message or regular email. This experiment shows the possibility of using modern technology as a communication bridge between healthcare professionals and outpatients. With the help of technology, healthcare professionals can reach out to their outpatients and take care of them as if they were in the hospital.
- TR-ECE-07-06: A case for staged database paradigm using CSP,
P. Vaidya and J. Lee
As millions and possibly billions of transistors become available on a single chip, most future processor architectures will be multicore processors with many concurrent execution units. Such multicore processor architectures present both novel challenges and opportunities for database system design. Recent research indicates that pipelined relational algorithms and staged databases might perform better than traditional database architectures on such multicore processor architectures. In this paper, we present a pipelined, partitioned hash-join algorithm developed using C++CSP, a threading library based on the formal specification of communicating sequential processes. Previous research has shown that the size of buffers is crucial to the performance of pipelined hash-join algorithms. In this paper, we corroborate these previous results in the context of the C++CSP library and show that a C++ library can be used for constructing pipelined relational algorithms and eventually a staged database pipeline. Our research shows that the performance of pipelined algorithms implemented with this library is comparable to that of implementations using traditional thread libraries. Furthermore, this approach provides opportunities to explore formal methods for the implementation of pipelined algorithms in relational databases.
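The shape of such a pipeline can be sketched as follows. This is only an illustration of the idea, not the report's implementation: it uses Python's standard threading and queue modules in place of C++CSP processes and channels, partitions the build side up front for brevity, and every function name and the buf_size parameter are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of a stream


def scan(rows, out):
    """Stage 1: stream the probe-side rows into the pipeline."""
    for row in rows:
        out.put(row)
    out.put(DONE)


def partition(inp, outs):
    """Stage 2: route each row to a partition chosen by hashing its key."""
    while (row := inp.get()) is not DONE:
        outs[hash(row[0]) % len(outs)].put(row)
    for out in outs:
        out.put(DONE)


def join_worker(build_rows, probe_in, out):
    """Stage 3: build a hash table for one partition, then probe it."""
    table = {}
    for key, value in build_rows:
        table.setdefault(key, []).append(value)
    while (row := probe_in.get()) is not DONE:
        key, probe_value = row
        for build_value in table.get(key, []):
            out.put((key, build_value, probe_value))
    out.put(DONE)


def pipelined_hash_join(build, probe, n_parts=2, buf_size=4):
    """Join two lists of (key, value) rows through bounded-buffer stages."""
    build_parts = [[] for _ in range(n_parts)]
    for row in build:
        build_parts[hash(row[0]) % n_parts].append(row)

    q_scan = queue.Queue(maxsize=buf_size)
    q_parts = [queue.Queue(maxsize=buf_size) for _ in range(n_parts)]
    q_out = queue.Queue(maxsize=buf_size)

    threads = [
        threading.Thread(target=scan, args=(probe, q_scan)),
        threading.Thread(target=partition, args=(q_scan, q_parts)),
    ]
    for i in range(n_parts):
        threads.append(threading.Thread(
            target=join_worker, args=(build_parts[i], q_parts[i], q_out)))
    for t in threads:
        t.start()

    results, finished = [], 0
    while finished < n_parts:          # drain until every worker signals DONE
        item = q_out.get()
        if item is DONE:
            finished += 1
        else:
            results.append(item)
    for t in threads:
        t.join()
    return results
```

The buf_size argument bounds every queue between stages, so it plays the role of the buffer size whose effect on performance the abstract discusses. The order of the returned rows depends on thread scheduling, so callers that need a stable order should sort the result.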
- TR-ECE-07-05: R-Tree: A hardware implementation,
X. Xiao, T. Shi, P. Vaidya and J. Lee
R-tree data structures are widely used in spatial databases to store spatial information and guide database search. Hence, the performance of the search operation in R-tree data structures has an important impact on the performance of query processing in spatial databases. To provide excellent performance of R-tree search operations, we propose a new parallel R-tree search algorithm, which utilizes an adjacency matrix representation for R-trees and performs the search by binary arithmetic among matrix elements. We demonstrate through simulation that, when implemented in hardware, the run-time complexity of the new algorithm is bounded by the height of an R-tree. Furthermore, we find that the proposed algorithm in hardware is 30 times faster than its software counterpart in solving an example search problem. In the future, more research will be conducted to adapt the proposed algorithm to the divide-and-conquer paradigm in order to solve larger problems.
- TR-ECE-07-04: Main memory DBMS on modern processors, a simulation approach for database performance characterization,
X. Xiao and J. Lee
Database applications are an important type of workload that is very different from other types of application workloads such as SPEC benchmarks. In the past, much research has been devoted to evaluating the performance characteristics of database workloads on various processor architectures. While most of the previous performance evaluations have been done using hardware counters, many recently developed full-system simulators have provided another method for performance characterization. With the simulation paradigm, some architectural features of processors that were hard to examine using hardware counters are now accessible. However, results of simulations have to be validated against real machine executions to ensure their fidelity. In this report, we use the Simics full-system simulator to measure the performance of architectural features of a simulated Pentium 4-like PC running the TPC-H workload on MonetDB (a main-memory database management system). Then, we compare our simulation results with some well-established results previously published in the literature. We find that our simulation results are mostly consistent with previously published results. In our simulation, approximately 83% of query execution time in the database is spent on various stalls. Memory stalls and resource-related stalls are the most prominent ones for all TPC-H queries under test. With confidence in the fidelity of the simulation results, we are now able to extend the framework in various ways to carry out more in-depth experiments for database workloads in the future.
- TR-ECE-07-03: Main memory DBMS on modern processors, our perspective,
P. Vaidya and J. Lee
In 1999, researchers characterized the performance of four commercial databases on the then-modern processors. The work, titled “DBMSs on a Modern Processor: Where Does Time Go?”, showed that databases spend as much as half their time on processor stalls and consequently spurred research to further accelerate databases. In this paper, we present our perspective as of 2007 based on the same central idea of database performance characterization and show that, despite several performance optimizations found in modern databases and processors, databases still spend a significant amount of execution time in stalls. Our approach differs from the aforementioned paper in the following ways. Firstly, we characterize the performance of the TPC-H benchmark instead of the querying framework explored in the aforementioned work. Secondly, we characterize the performance of database processes alone, as opposed to the previous system-wide monitoring approaches. This enables us to characterize the performance of databases independently of any other executing processes. Thirdly, due to the advancements in processor performance monitoring hardware and software, we are able to delve deeper into the nature of the stalls and identify the key microarchitectural components that cause performance impediments in databases. As a result, we not only corroborate some of the previous findings but also present more detailed insight into how database performance can be boosted by engineering modern processors and databases specifically to overcome performance bottlenecks.
- TR-ECE-07-02: An O(min(m,n)) Parallel Deadlock Detection Algorithm and Hardware for Multi-unit Resource Systems,
Xiang Xiao and Jaehwan John Lee
This report describes a novel parallel Multi-unit resource Deadlock Detection Algorithm (MDDA) and its hardware implementation (MDDU). The contributions are (i) the first O(1) hardware deadlock detection, (ii) reduced O(min(m, n)) preparation, where m and n are the number of processes and resources, respectively, and (iii) support for multi-unit resources. O(min(m,n)), previously O(m×n), is achieved by performing all the searches for sink nodes for each and every resource in parallel in hardware over two matrices representing resource allocations as well as other auxiliary matrices. Moreover, we prove the correctness and run-time complexity of MDDA. MDDU provides a fast and deterministic deadlock detection mechanism for Multiprocessor System-on-Chips (MPSoCs), which we predict will become prevalent in system designs in the near future. Our experiments demonstrate that MDDU always takes two clock cycles to detect deadlock regardless of the size of the system. Lastly, the MPSoC area overhead due to MDDU is small, approximately 0.024 percent for MDDU16x16 on our example MPSoC.
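For readers unfamiliar with the problem, the sketch below shows the textbook sequential detection procedure for multi-unit resources. It is not MDDA: it runs in software, repeatedly scans all processes, and is shown only to make the inputs (an allocation matrix and a request matrix over m processes and n resources) and the expected output concrete. The function and parameter names are hypothetical.

```python
def deadlocked_processes(available, allocation, request):
    """Return the indices of deadlocked processes.

    available[j]     -- free units of resource j
    allocation[i][j] -- units of resource j currently held by process i
    request[i][j]    -- additional units of resource j that process i waits for
    """
    work = list(available)
    # A process that holds nothing cannot be part of a deadlock.
    finished = [not any(row) for row in allocation]

    progress = True
    while progress:
        progress = False
        for i, (held, wanted) in enumerate(zip(allocation, request)):
            if finished[i]:
                continue
            if all(w <= free for w, free in zip(wanted, work)):
                # Process i can run to completion and release what it holds.
                work = [free + h for free, h in zip(work, held)]
                finished[i] = True
                progress = True

    return [i for i, done in enumerate(finished) if not done]
```

For example, with no free units, two processes that each hold one unit of a different resource and each wait for the other's unit are both reported as deadlocked. If the first process waits for nothing, it can finish and release its unit, and the list comes back empty.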
- TR-ECE-07-01: Swarm Group Mobility Model for Ad Hoc Wireless Networks,
Dongsoo S. Kim and Seok K. Hwang
This paper proposes a new group mobility model for wireless communication. To describe interactions among a set of nodes, the mobility model considers the psychological and sociological behavior of each node and its perception of other nodes. The model assumes no permanent group membership, so it can capture natural behaviors such as fork and join. It emulates the cooperative movement pattern observed in mobile ad hoc networks for military operations and campuses, in which a set of mobile stations carries out a cooperative motion affected by individual behavior as well as group behavior. The model also employs a physics model to avoid sudden stops and sharp turns.