DEVELOPMENT OF A SMART CAMERA SYSTEM ON AN FPGA

by

Monica Jane Whitaker

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering

MONTANA STATE UNIVERSITY
Bozeman, Montana

November, 2016

© COPYRIGHT by Monica Jane Whitaker 2016
All Rights Reserved

ACKNOWLEDGEMENTS

I would like to acknowledge the faculty and staff of the Electrical and Computer Engineering Department as well as those of the Gianforte School of Computing at Montana State University for their continued support and encouragement throughout my undergraduate and graduate education.

Funding Acknowledgment

This work was kindly supported by the Montana Research and Economic Development Initiative, Montana Board of Research and Commercialization Technology and Resonon, Inc.

TABLE OF CONTENTS

1. INTRODUCTION AND BACKGROUND
    Introduction
    Hyperspectral Imaging
        Classifying a Hyperspectral Image
    Object Sorting
    Smart Cameras
    Existing Smart Cameras
        Matrix Vision
        Matrox Imaging
        Eye Vision Technology
        Teledyne Dalsa
        The Winner is ... None of the Above
2. MOTIVATION
    Beneficiaries
    Current Processing System
    Why FPGA?
3. SYSTEM DESIGN
    Logistic Regression Algorithm
    Hardware Elements
        Arria 10 SoC
        Development Board Components
        Additional Custom Boards
    Project Overview
        Camera Interface
        DRAM Interface
        Computation Unit
            Pixel Classification
            Object Classification
        FPGA Interface
        Performance
4. IMPLEMENTATION DETAILS
    Programmable Oscillator
        Registers
    Programmable Clock Generator
        Design Decisions
        Registers
        Burning a Configuration
        Utilizing the Clock Generator
    Altera IP
    Timing Constraints
    Toolchain Fights
        SignalTap
        TimeQuest Timing Analyzer
        Chip Planner
        MATLAB
    Toolchain Tricks
5. TEST AND VERIFICATION
    Camera Interface
    DRAM
    Computation Unit
    FPGA to FPGA Transmission
6. CONCLUSION
REFERENCES CITED
APPENDICES
    APPENDIX A: Register Descriptions
    APPENDIX B: VHDL Code
    APPENDIX C: MATLAB Code

LIST OF TABLES

3.1 Pixel Information in DRAM
4.1 Register settings for Si570
A.1 ENABLE Register Description
A.2 IRQ ENABLE Register Description
A.3 IRQ PENDING Register Description
A.4 NUM BINS Register Description
A.5 NUM PIXELS Register Description
A.6 NUM CLASSES Register Description
A.7 FRAME COUNT Register Description
A.8 MEAN Register Description
A.9 STD DEV I Register Description
A.10 COEFFICIENT Register Description
A.11 INNER PRODUCT Register Description
A.12 DECISION VECTOR Register Description

LIST OF FIGURES

1.1 A mock-up of the full system as it is intended to operate.
1.2 An example of hyperspectral line scan images over several frames.
1.3 Robot sorting almonds.
2.1 Depiction of a typical image processing system [1].
2.2 This is a depiction of the future of image processing, with an integrated camera sensor and FPGA processor [1].
2.3 Graphical depiction of relative resources in the Arria 10 SoC chip
3.1 Example inner product calculation.
3.2 High level view of the components external to the SoC utilized in the system. The colored regions depict the individual PCBs.
3.3 The PCBs created for the hyperspectral camera.
3.4 Block diagram of the full system functionality
3.5 Block diagram of the camera interface subsystem.
3.6 Block diagram of the memory subsystem.
3.7 Block diagram of the computation subsystem.
3.8 The connection between the Arria V development board (top) and the Arria 10 development board (bottom).
4.1 Factory Default Clock Register Settings for Si570
4.2 Preferred Clock Register Settings for Si570
4.3 Diagram of Pin Assignments for VersaClock 6 Programmable Clock Generator
4.4 A fitted floor plan in the Arria 10.
5.1 Generated plot depicting ratios between the pixels of the line scan camera and the pixels of the hyperspectral camera.
5.2 Zoomed in plot of ratios between monochrome and hyperspectral camera pixels

ABSTRACT

In recent years, hyperspectral cameras have been appearing in many applications that need more information than conventional color cameras can provide. A hyperspectral camera is able to capture data ranging in wavelength from the visible spectrum all the way into the infrared. In this way, it is able to 'see' hundreds of colors, far more than the human eye or a standard camera, which typically uses only 3 spectral values (corresponding to the standard red, green, and blue colors). Because these cameras can generate large amounts of data at increasingly fast frame rates, conventional computers are not able to perform all the necessary processing in real time. Because of this limitation, a new system is needed to perform the image processing. This master's thesis is meant to contribute to the development of a smart camera targeted at hyperspectral image processing using a Field Programmable Gate Array (FPGA) and object sorting with a prototype waterfall system. Through the use of a Hardware Description Language (HDL), a currently used image processing algorithm has been implemented to classify pixels. Additionally, an architecture for full object classification has been designed and tested for the FPGA. High-speed transceivers are used to move data between multiple FPGA development boards.
When paired with a hyperspectral camera and a monochrome line scan camera, this prototype system is capable of scanning objects in freefall and deciding within milliseconds whether or not to keep the object. This decision will dictate the action of air jets to displace unwanted objects. This full system is potentially of interest to small businesses or farms as it will enable farmers to perform their own premium bulk sorting in a cost-effective manner.

INTRODUCTION AND BACKGROUND

Introduction

A smart camera system is being developed to target sorting applications using a hyperspectral camera. The overall system in development includes the hyperspectral camera, a monochrome line scan camera, and a sorting mechanism that uses air jets to perform the physical sorting. This camera system will replace existing systems by removing the need for cables between the camera and the processing unit as well as replacing conveyor belts and robots with a vibrator feeder and air jets. In doing so, with the help of the hyperspectral data, sorting may become more accurate and the unit may end up being cheaper and consequently more accessible to small businesses. This project is a prototype for the end result and is consequently not as compact as the final product is anticipated to be. It nevertheless performs all the necessary calculations and produces a result that triggers the air jets, sorting objects with high precision thanks to the inclusion of hyperspectral data.

This smart camera system utilizes two System-on-Chip (SoC) devices that each consist of a Field Programmable Gate Array (FPGA) fabric and a Hard Processor System (HPS) implemented on a single silicon chip for easy and fast interactions. The fabric of these SoC FPGAs is used for the processing of all data generated by the cameras, while the HPS is utilized in user interactions and memory transfers.
The monochrome line scan camera is included to detect objects at the time of imaging and to build an object profile, so that the processing unit can make a complete object decision based on the compiled individual pixel decisions. The decisions are made based on classes designed around the hyperspectral characteristics found using the hyperspectral camera included in this system.

Figure 1.1: A mock-up of the system as it is intended to operate. The product will fall from the conveyor belt and be imaged by both cameras (one high resolution monochrome and one hyperspectral) simultaneously before either continuing its fall or being ejected by air jets.

This thesis focuses primarily on the development and implementation of the image processing algorithm in addition to the interaction between development boards. The air jet system is in development by a separate team of engineers, as is the monochrome camera processing subsystem that identifies object boundaries for the hyperspectral camera. In the implementation of the prototype design for this project, the author of this thesis is responsible for the development and testing of the image processing algorithm for the hyperspectral camera data and the transceiver communication between development boards. The author also worked with the development tools to compile the whole project and fix timing errors. Additionally, this author set up the data access method between the FPGA and the off-chip Dynamic Random Access Memory (DRAM) connected directly to the FPGA fabric via dedicated and hardened DRAM controllers. The details of this system are abstracted away and a few control lines are available for use by other subsystems.

Hyperspectral Imaging

Resonon defines a hyperspectral image as a digital image with far more spectral information for each pixel than traditional color cameras.
The resulting data can be pictured as a cube with dimensions in the spatial x and y directions and a third dimension in the spectral wavelength, as seen in Figure 1.2. The cameras utilized in the sorting applications explored within this thesis are line scan cameras, so a frame consists of a single line (spatial y = 1) of pixels (spatial x), with the spectral bands occupying the 'third' dimension. With the extra wavelength values, including those in the infrared, hyperspectral cameras are able to sense much more information than the human eye or a typical RGB color camera. This technology is used in applications ranging from remote sensing to quality control to sorting [2].

Figure 1.2: An example of hyperspectral line scan images over several frames (lines). [3]

Classifying a Hyperspectral Image

Every pixel within an object contains a nearly unique spectral signature which can be used to classify it. In order to do so, a class is defined by compiling a variety of images of the object and determining the spectral signature that most commonly defines the pixels within the object. This is done for each of the possible classes expected to be seen in the surveyed objects. In this way, each class is a vector of values across all the spectral bands. Using these classes, a statistically-based machine learning algorithm is used to compute the probability that the pixel belongs to each class. A number of different machine learning algorithms are acceptable for this objective; however, a logistic regression approach has been chosen for the implementation of the prototype based on currently used systems. For logistic regression, the computation uses an inner product between the spectral signature of the pixel and each of the considered classes to reach a probability.
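The per-pixel step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis implementation (which is written in VHDL): the function names `inner_product` and `classify_line`, the three-pixel, four-band frame, and the two class vectors are all hypothetical values chosen to make the arithmetic easy to follow.

```python
def inner_product(signature, class_vector):
    """Inner product of one pixel's spectral signature with one class vector."""
    if len(signature) != len(class_vector):
        raise ValueError("pixel and class must cover the same spectral bands")
    return sum(s * c for s, c in zip(signature, class_vector))


def classify_line(frame, classes):
    """For each pixel in a line-scan frame, return the index of the best-scoring class."""
    labels = []
    for signature in frame:
        scores = [inner_product(signature, c) for c in classes]
        # Keep the class whose inner product is largest for this pixel.
        labels.append(max(range(len(scores)), key=lambda k: scores[k]))
    return labels


# Hypothetical line-scan frame: one line of 3 pixels, 4 spectral bands each.
frame = [
    [0.9, 0.8, 0.1, 0.1],
    [0.1, 0.2, 0.9, 0.7],
    [0.8, 0.9, 0.2, 0.0],
]
# Hypothetical class vectors: class 0 favors the first two bands, class 1 the last two.
classes = [
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0, 1.0],
]
labels = classify_line(frame, classes)
print(labels)
```

Each pixel is handled independently of its neighbors, which is one reason this kind of computation maps naturally onto parallel hardware.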
For each pixel in an object, the highest class probability is kept. The probabilities of each class are then added over the object, and the class with the highest total is taken to be the class to which the object belongs.

Object Sorting

Sorting has been a large concern in developed nations for many years as manufacturers strive to put out quality products. It is gaining popularity in developing countries and becoming even more important in the places where it already exists, as quality control is brought to the forefront of society's attention. Particularly in the food industry, increased industrialization has brought forward a push toward healthy convenience foods. Sorting is also very important in agricultural applications, as farmers need to sort their crops after harvesting. In many cases, this is done by an industrial company that then reports back with the percent loss due to the sorting mechanism. Of course, by sending the crop away, farmers have no way of verifying the reported loss, and it would be easier and more reliable for them to have their own means of sorting.

Machine sorting helps to avoid the inconsistent nature of manual sorting [4] and avoids the significant loss of good product that may result from vibration sorting or other mechanical means. Sorting machines come in a variety of shapes, sizes, and technologies. These include using lasers, cameras, or x-rays in conjunction with robotic arms or air jet systems to sort products and separate the bad from the good [4]. As the technology advances, sorting capabilities will be expected to advance as well.

Figure 1.3: Robot sorting almonds [2]

Smart Cameras

In many applications, cameras of all varieties are used to acquire data about the subject matter that can be studied at a later time. This data is also generally processed external to the system in which it resides. However, it is becoming more necessary and common for processing to happen on-board, enabling the system to adjust in real time.
Because of this, real-time processing is in high demand and the techniques are still being perfected. Further, the algorithms to process data generated by cameras are in constant refinement as researchers learn what they want to see from the data and how best to achieve those results. As algorithms are refined and cameras are built to generate more data than ever before, it becomes necessary to have the right infrastructure to support the real-time processing of imaging data, and thus we find the niche for a smart camera.

Existing Smart Cameras

There are several smart cameras currently in existence, including some with a programmable FPGA or selectable image processing algorithms for use in the desired application. These include several by Matrix Vision and some by other companies such as Matrox Imaging, EVT, and Teledyne Dalsa, as further described in the following subsections. Though these cameras are likely very useful in some applications that require on-board image processing, they lack the real-time processing advantages gained from the use of the Arria 10 FPGA, which are detailed in the last subsection below.

Matrix Vision

Matrix Vision has created several smart cameras, two of which are notable for image processing in industry. The mvBlueGEMINI is touted as a 'Tool box technology camera' and includes an SoC with an FPGA and a dual-core Cortex-A9 with 800 MHz-capable clocks, along with a camera sensor with 1280 x 1024 resolution. The software that comes with the camera includes a Graphical User Interface (GUI) with which users can choose the task to complete. The frame rate on this single sensor is unspecified. [5]

The other smart camera by Matrix Vision is the mvBlueLYNX-X. This camera has options for either CCD or CMOS sensors in addition to a hybrid dual core. This features a Cortex-A8 ARM CPU with a separate real-time Digital Signal Processing (DSP) unit with video interface. The CPU has a clock speed up to 1 GHz, while the DSP can run up to 800 MHz.
Though the camera is available in a number of different resolutions, the largest, 2592 x 1944, has a maximum frame rate of 14.4 Hz and the next largest, 1280 x 1024, has a maximum frame rate of 60 Hz. [6]

Matrox Imaging

Matrox makes a smart camera called the Iris GT, which comes with a design assistant and a web-based interface for the integrated development environment. This camera has an Intel Atom embedded processor running Windows as well as a built-in keyboard, monitor, and mouse for a friendly user interface. It is compatible with a variety of monochrome and color CCD sensors. Matrox claims this camera and software are ideal for a variety of machine vision applications including agriculture, aerospace, and more. The highest frame rate is 110 frames per second (fps), with an effective resolution of 640 x 480. [7]

Eye Vision Technology

Eye Vision Technology (EVT) creates several variations of smart cameras. The RazerCam, for instance, is packaged with a freely programmable FPGA and two ARM cores based on the Xilinx Zynq SoC. Users are limited to choosing between one of two matrix sensors or a line scan sensor with 4K resolution. That line sensor claims a frame rate of 10000 fps with 10-bit pixel data, but the matrix sensors do not exceed 60 fps. The ARM cores run Linux for user convenience in interaction. The greatest downfall of this camera is the lack of hardened floating point on the FPGA, which could hinder the speed or accuracy of results, not to mention development speed. EVT also has a series of EyeCheck smart cameras, almost all of which run at around 30 or 60 fps at resolutions in the thousands. One version has 180 fps and a Xilinx Artix-7 FPGA with 28K logic elements. This FPGA is in the low end of Xilinx's product portfolio, designed to optimize power and cost. [8]

Teledyne Dalsa

Teledyne Dalsa offers several vision systems with embedded software applications. There are monochrome sensors available with resolution up to 1600 x 1200.
The processing includes an embedded CPU and DSP with a choice of software. These are built robustly for integration in factory environments. These cameras are ideal for still-image quality control and do not have the high clock rate possible with an FPGA. [9]

The Winner is ... None of the Above

As seen here, there are many different options for smart cameras already on the market that could be fitted with a hyperspectral front end and used for sorting objects. Ultimately, none of these were chosen, because none offers the best combination of options. Some of these are outfitted with DSP software and pre-programmed algorithms to choose from. Others use FPGAs for user configurability. However, the DSP software is not all-encompassing and the FPGAs more than likely do not have hardened floating point blocks. In the application space targeted here, the hardened floating point is particularly valuable for 'cheaper' calculations with greater accuracy. Further, by using only an FPGA to do all the processing, any algorithm could be configured and used. Real-time processing also greatly benefits from the deterministic latencies of FPGAs, whereas other systems are compromised by the inclusion of numerous memory accesses or operating systems. Additionally, the sensors available for these cameras have frame rates less than 100 fps in most cases, and the ideal sensor will be collecting data much faster than that. For these reasons, it was decided that a new smart camera should be developed and thus, this project was born.

MOTIVATION

Beneficiaries

This project is done in support of, and with support from, Resonon Inc. in the belief that they will be able to utilize the smart camera in their machine vision technology systems. Upon completion of a system prototype, of which this thesis is a subsection, they could utilize the processing method and small form factor in other integrated systems that they pair with their optical technology.
Further, the Montana Board of Research and Commercialization Technology helped to start the work on this project and its first proof-of-concept iteration, as it provided the primary source of funding for the materials and man-hours spent developing this technology implementation.

The primary focus for this smart camera implementation is food sorting, but the technology could be utilized in any sort of assembly-line environment requiring quality assurance checks. Currently, in areas using the Resonon machine vision technology, there is still a need for manual sorting after the machine has performed its sorting, because the current system is not capable of processing all the necessary data fast enough to make highly accurate classifications. Due to the unavailability of a suitable image processing system, images are captured at a lower resolution than the Resonon optics technology is ultimately capable of so that processing can be done on a traditional PC. The goal of having an efficient real-time integrated machine sorting system that is able to process high-resolution images is to eliminate the need for manual sorting after the machine, which will free up workers for other tasks. Using an FPGA enables a fully customizable smart camera implementation that could apply in several application areas.

Current Processing System

A typical image processing system is shown in Figure 2.1. This system is composed of a camera, a frame grabber to configure and capture image data from the camera, and a computer to perform the processing.

Figure 2.1: Depiction of a typical image processing system [1].

Though this method has worked for many years, it limits the speed at which images can be processed due to several bottlenecks. The first bottleneck comes from the cables, which limit bandwidth. The second bottleneck is the speed of the computer, which limits the speed of real-time computations.
The previous proof-of-concept system utilized Camera Link connections to connect the camera to the FPGA. The Camera Link standard was modeled on the Channel Link technology, which is able to transmit data at up to 2.38 Gbps [10]. With current applications of image processing, the need for real-time results is growing and placing a strain on the capabilities of existing systems.

This project seeks to provide a replacement for these traditional systems by integrating all three components as shown in Figure 2.2 and removing the need for any cables. The proposed implementation involves short ribbon cables to move data between the board housing the camera sensor and the FPGA board. This is done so that the camera sensor can be easily placed at a 90° angle to the board for this prototype. These ribbon cables will eventually be replaced by board-edge connectors, since the cables are not required for data movement and could easily be removed in a final product.

Figure 2.2: This is a depiction of the future of image processing, with an integrated camera sensor and FPGA processor [1].

Why FPGA?

One of the biggest advantages of FPGAs over standard computers is the deterministic low-latency data paths achievable in custom application-specific architectures. CPUs have fixed architectures and variable latency depending on where the data is stored or moved (cache, main memory, etc.). FPGAs are made of programmable logic blocks, SRAM, and DSP blocks that can be reconfigured for varying applications. The architecture of an FPGA is like a grid, with logic connected over interconnects between blocks. Because of this structure, FPGAs have no cache, and a flexible architecture can be programmed by the user using a hardware description language.
In doing so, ultimate control is maintained over what occurs on each clock cycle and even where each of the internal registers is placed to obtain the best path through the device. The deterministic latency is key for real-time systems, as the user is able to guarantee that the performance is real-time. Fabric is also easily expanded by adding more logic blocks, which enables manufacturers to create similar devices of varying size and complexity. In this way, FPGAs have been tailored to suit a whole market of people with varying needs, resources, and cost targets.

In this project two SoCs are used instead of standard FPGAs, which do not include the ARM CPUs. An SoC contains a dual-core ARM processor on the same chip as the FPGA along with hardened peripherals (Ethernet, USB, etc.), which together are referred to as a Hard Processor System (HPS). Providing an HPS that can serve as a smart interface between the FPGA logic and the outside world makes it easier to communicate with external computers for the passing of data. This is generally accomplished using the Ethernet connection to obtain an IP address on the Linux system running on the HPS. However, while the HPS is still functioning as a standard computer and is subject to timing constraints applied by the OS scheduler, the FPGA can run continually and perform the computationally demanding or timing-specific tasks concurrently. It is able to send interrupts to the HPS, and the HPS can read or write to the memory available to the FPGA. A depiction of the resources available and their relative locations within the Arria 10 SoC chip is shown in Figure 2.3. The architecture of the Arria V SoC is similarly laid out, though with lower-level technology in the transceivers and DSP blocks.
Figure 2.3: Graphical depiction of relative resources in the Arria 10 SoC chip [11]

SYSTEM DESIGN

Logistic Regression Algorithm

One of the simplest hardware-implemented classification algorithms is logistic regression. The inputs are one vector each of classification coefficients, means, and standard deviations, the input data, and matrices of a full bright image and a full dark image. The equations describing the process are:

\[ \text{normalized} = \frac{\text{data} - \text{dark}}{\text{white}} \tag{3.1} \]

\[ \text{corrected} = \frac{\text{normalized} - \text{mean}}{\text{standard dev}} \tag{3.2} \]

\[ \text{product} = \text{coefficient} \times \text{corrected} + \text{previous} \tag{3.3} \]

In these equations, previous represents the running sum. It starts with the zeroth coefficient value and subsequent products are added on. An example is shown here assuming input vectors of the corrected data and the coefficients. The coefficient vector is one element longer than the input vector.

    % coefficient(1) holds the zeroth coefficient that seeds the running sum
    for i = 1:NUMBER_BINS
        if i == 1
            product = coefficient(i) + corrected(i) * coefficient(i+1);
        else
            product = product + corrected(i) * coefficient(i+1);
        end
    end

The final step of the logistic regression calculation is calculating the probability associated with the inner product result. This probability is given by Equation 3.4, where X = product after the inner product is completed.

\[ P = \frac{1}{1 + e^{-X}} \tag{3.4} \]

For this system, the classification is not dependent on the probability value, only on the relative probability. Given the one-to-one mapping between the inner product result and the probability, it is sufficient to use the inner product result as a representative of the relative probability for each class to determine the class that each pixel most likely belongs to.

To ensure that the computations are hardware-friendly, only multiplications and additions are implemented. This means that any numbers needing to be divided are first inverted in software before being ported to the hardware.
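As a rough software model of the paragraphs above, the following Python sketch evaluates Equations 3.1 through 3.3 for one pixel using only multiplications and additions, with the divisors inverted once beforehand. It is an illustration under stated assumptions, not the thesis's VHDL or MATLAB code: the function names, the three-bin example data, and the two coefficient vectors are hypothetical, and `probability` uses the standard logistic function 1/(1 + e^(-X)) purely to show that ranking classes by inner product gives the same answer as ranking them by probability.

```python
import math


def reciprocals(values):
    """Software-side step: invert each divisor once so later stages only multiply."""
    return [1.0 / v for v in values]


def pixel_score(data, dark, inv_white, mean, inv_std, coefficient):
    """Equations 3.1 to 3.3 for one pixel and one class, with no divisions."""
    if len(coefficient) != len(data) + 1:
        raise ValueError("coefficient vector must be one element longer than the data")
    product = coefficient[0]  # the running sum starts at the zeroth coefficient
    for i in range(len(data)):
        normalized = (data[i] - dark[i]) * inv_white[i]   # Eq. 3.1
        corrected = (normalized - mean[i]) * inv_std[i]   # Eq. 3.2
        product += coefficient[i + 1] * corrected         # Eq. 3.3
    return product


def probability(x):
    """Standard logistic function; monotonically increasing in x."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical three-bin pixel and calibration values.
data = [120.0, 200.0, 80.0]
dark = [20.0, 20.0, 20.0]
inv_white = reciprocals([200.0, 200.0, 200.0])
mean = [0.5, 0.5, 0.5]
inv_std = reciprocals([0.2, 0.2, 0.2])

# Hypothetical coefficient vectors for two classes (zeroth coefficient first).
class_a = [0.1, 1.0, 1.0, 1.0]
class_b = [0.0, -1.0, -1.0, 1.0]

score_a = pixel_score(data, dark, inv_white, mean, inv_std, class_a)
score_b = pixel_score(data, dark, inv_white, mean, inv_std, class_b)
scores = [score_a, score_b]

# Because the logistic function is monotonic, both rankings pick the same class.
best_by_score = max(range(len(scores)), key=lambda k: scores[k])
best_by_probability = max(range(len(scores)), key=lambda k: probability(scores[k]))
print(best_by_score, best_by_probability)
```

Skipping the exponential in hardware changes nothing about which class wins; it only means the stored result is a score rather than a probability.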
The computation of logistic regression involves matrix inner products. Given normalized inputs and classification coefficients, the inputs are multiplied by the corresponding coefficients and a running sum is kept over a pixel to achieve a single result representing the probability that the pixel belongs to that class.

Figure 3.1: Inner product calculation with vector of coefficients and matrix of normalized values with dimensions of number of bins by number of pixels.

The normalization uses the white and dark values that are passed with each data input as well as stored mean and standard deviation values, which represent the mean and standard deviation across the spectral bins from a training set. The white and dark values are used to scale the data to lie between 0 and 1, while the mean and standard deviation account for the frequency of particular values. The dark value is subtracted from the data and the result is multiplied by the inverse white value. Subsequent operations involve subtraction of the mean and multiplication by the inverted standard deviation. The inverses of both the white values and the standard deviations are calculated externally before being stored in system memory, so no hardware divides are required, which reduces resource usage and allows faster clocks.

Hardware Elements

Arria 10 SoC

The Arria 10 SoC by Altera (which was acquired by Intel in 2015) was chosen as the primary computation engine on this project for several reasons. The primary reason is its hardened floating point units, which enable the device to deliver over 1.5 trillion floating point operations per second [12].
This is the first device on the market to incorporate single-precision floating-point multipliers and adders into the hard DSP blocks [12]. Their addition greatly improves system development, since fixed-point algorithms take much more effort to develop and soft floating-point calculations consume large amounts of FPGA fabric to build floating-point multipliers. In addition, the largest device in this family has 660K logic elements (LE), over 42 Mb of M20K memory, and up to 48 transceivers capable of 17.4 Gbps [11]. These resources make it the best mid-range FPGA on the market at the time of writing. Future iterations of this system could use different versions of the Arria 10 or move to a higher-end device in the Stratix 10 family (the largest of which will have around 30 billion transistors [13]). The Arria 10 is the best fit for this purpose because its performance surpasses the speed requirements of the cameras, which leaves margin for data alignment, while the device remains affordable. The Stratix devices, while better in number of logic elements and DSP performance, really excel in transceiver performance and are best suited for tasks involving high transmission rates. Though this design does use transceivers, and may benefit in the future from moving to these more advanced devices, the higher capability is not necessary given the limit of the data rate from the cameras.

Development Board Components

In addition to the Arria 10, other components on the development board that were used for this project include the DDR3 DRAM, the SMA connectors, and the FMC connector. The board provides 1 GB of DRAM each for the FPGA and the HPS. This memory is used for storing the light and dark matrix values. The SMA connectors are used as the interface for the transceivers to communicate with the Arria V FPGA.
A daughter card was developed to plug into the FMC for the purpose of bringing camera data into the FPGA.

Another important part of the project is the Arria V SoC, which is also on a development board that includes SMA connectors, an FMC connector, a Max V CPLD, and a programmable oscillator. The SMA connectors here are again the interface for the transceivers. The FMC is used for the custom daughter card that connects the monochrome camera to the FPGA, and the oscillator is used to achieve the ideal clock frequency for the transceiver communications. The oscillator is programmed over I2C by the Max V, which has to be programmed separately prior to running the desired program on the FPGA. The high level diagram of system components is shown in Figure 3.2. Not shown is the external PC, which will interact with the HPS over Ethernet. While the Arria V will be replaced with an Arria 10 in future system implementations, it is currently used for the monochrome camera subsystem because of its initial use in the development of this subsystem.

Figure 3.2: High level view of the components external to the SoC utilized in the system. The colored regions depict the individual PCBs.

Additional Custom Boards

In addition to the two FPGA development boards, the project also required the creation of several daughter cards, i.e. printed circuit boards (PCBs). Three cards were developed using the PADS software from Mentor Graphics [14] by teammates Connor Dack and Alex Matejunas. The first card is designated the 'sensor board', which contains all of the circuitry to connect to the lines of the CMOS sensor chosen to be the face for the hyperspectral front-end. All of the lines are drawn out to two 100-pin ribbon cable connectors. This board is separate so that it can be oriented at a 90° angle to the rest of the boards, and also so that it is modular and could easily be swapped with a different sensor, should the need or desire arise.
A second board contains more ribbon cable connectors and connects the data from the cables to the FMC, which brings it into the FPGA for processing. This board also contains circuitry for the transceivers, including SMA connectors and a clock generator to provide a reference clock, since the Arria 10 development board does not contain any SMA connectors for transceivers. Both boards are shown in Figure 3.3 connected to the Arria 10 pre-production development board.

Figure 3.3: The PCBs created for the hyperspectral camera connection to the Arria 10 FPGA. They are shown attached to the FMC, without the ribbon cables and coaxial cables.

A board was also created to connect to the FMC on the Arria V with inputs for the monochrome line scan camera's Camera Link cables. A custom board was needed because not all of the FMC pins are connected on the Arria V development board, though they are needed for communication with the camera. Consequently, a daughter card was developed to map the camera signals to connected FMC pins on the FPGA.

Project Overview

In order to implement a processing system on the FPGA, the required tasks were broken into system blocks as detailed in the following sections. These blocks are the camera interface, the DRAM interface, the computation block, and the FPGA-to-FPGA interface, which encompasses the communication system between the two boards in order to send signals to the air jet system and transmit object information.

Figure 3.4: Block diagram of the full system functionality

Camera Interface

In order to integrate the camera sensor on this prototype system, two additional boards were fabricated. The first board houses the sensor and has connectors for the data to pass to the second board. The second board is connected to the development board housing the FPGA; it routes all the data signals to the appropriate places to be accessed from the FPGA and ensures that the control signals are appropriately routed.
This board also contains clock generator circuitry and SMA connectors for transceiver communication purposes. As previously mentioned, the two boards are connected via ribbon cable in the prototype to allow the sensor to be at a 90° angle from the other boards. Placing the sensor on its own board makes this a modular product, in which other sensors (on their respective boards) could replace the current one so long as the signals are routed in the same way through the connectors.

Figure 3.5: Block diagram of the camera interface subsystem.

The programmable hardware interface for the camera consists of a state machine that compiles all the bits per pixel as they are presented and attaches location information describing which pixel and spectral band the data belongs to. It also pulls the data from DRAM through a FIFO interface and verifies that the location information matches that of the pixel being compiled. This interface also sends the control signals required by the camera, namely the start trigger and the clock signal.

The latency in the camera interface is determined by the camera data rate as well as the number of cycles needed to delay the data before it is passed on in order for it to be parallelized. Since the data is presented in 10 taps, one bit at a time, a delay is needed to accumulate all the relevant bits per pixel per band. Additionally, the data is delayed to create 5 parallel channels, because it is initially presented serially. The delay and parallelization were chosen because, while they slow down the initial presentation of the data to the computation unit, the increase in computations completed through parallel channels is great enough to overcome this initial serial latency.

DRAM Interface

In order to account for the effects of the camera, all incoming data is normalized by the white and dark values as in Equation 3.1.
These are meant to correspond to the largest and smallest possible data values that have been, or could be, seen. This data is captured in still images taken prior to operation of the system. The dark image is taken while the lens cap is on the camera to provide the darkest environment theoretically possible. In contrast, the light image is captured while the camera is fixed on a white strip that is lit to the brightest value seen under the operational lighting conditions. Given that these are full matrices, with potentially different values at each pixel/band, all the information captured needs to be stored. Due to its size, it was decided to store this data in off-chip Dynamic Random Access Memory (DRAM) so that there is still plenty of room for more frequently accessed and changing data in the on-chip SRAM. This was also deemed an acceptable storage location because the values are only accessed before the computations and on-chip memory is used to buffer the values as they are accessed, so the data is less time-critical relative to the time of address specification (i.e. the DRAM matrix values are pre-fetched, alleviating any effects of DRAM refresh stalls).

Figure 3.6: Block diagram of the memory subsystem.

DRAM consists of a grid of capacitors and transistors where each capacitor is capable of storing a single bit based on its voltage level. The transistor is used to access that particular capacitor and charge or drain it as necessary. The memory has to be refreshed periodically to keep the stored values, as the capacitors lose their charge over time. Since each stored bit only requires a single transistor and capacitor, this memory is very dense and cheap, making it attractive in industrial applications. However, DRAM is not quick to access in comparison to the SRAM that is located in the FPGA fabric. The timing of controller interactions with the memory is also
very technically challenging, but is handled through a hardened memory controller in the FPGA that provides the direct interface functionality. A custom controller was developed to be the master of this interface, and this controller issues read or write commands.

As camera data enters the system, prior to being processed in the computation subsystem, it needs to be properly aligned with the corresponding location's white and dark matrix values. In order to accomplish this, the matrix values are pulled from memory in the same order that camera data is received. The matrix data is stored in such a way that this order corresponds to sequential addresses in the memory. By doing so, burst access, in which several consecutive address locations are read back to back, can be used to take advantage of the structure of DRAM. A pre-buffering system is in place to hold the burst read data and enable alignment with the camera data, which comes in faster than the DRAM could be accessed if each location were read individually. The white and dark values for each location are stored together, requiring only one memory access per location. This was done to facilitate ease of access and use, as both values need to be aligned with the incoming data. Additionally, the bus between DRAM and the FPGA is wide enough to accommodate each of the data values and the location (see Table 3.1). The pre-buffer's output is made available to the camera interface to enable the data alignment with incoming values.

Table 3.1: Information Associated With a Pixel in DRAM

Bit 127                                                              Bit 0
| ZERO PADDING [31:0] | LOCATION [31:0] | DARK [31:0] | WHITE [31:0] |

Computation Unit

The pixel and object classifications are done in the computation unit. This block consists of the normalization step as well as the inner product engine to complete the classification. It also compiles a full object classification, sorts the results, and makes a decision at the end.
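As an illustration of the DRAM record in Table 3.1, the following sketch packs and unpacks one 128-bit entry. It rests on several assumptions that the text does not confirm: the fields are taken to occupy the word from most significant to least significant in the order the table lists them, the dark and white fields are taken to be IEEE 754 single-precision values, the white field is taken to hold the pre-inverted white value, and all function names are hypothetical.

```python
import struct

MASK32 = 0xFFFFFFFF

def float_to_bits(value):
    """Raw 32-bit pattern of a single-precision float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

def bits_to_float(bits):
    """Single-precision float from a raw 32-bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def pack_entry(location, dark, inv_white):
    """Build one 128-bit record; bits 127..96 stay zero (padding)."""
    return ((location & MASK32) << 64) | (float_to_bits(dark) << 32) | float_to_bits(inv_white)

def unpack_entry(word):
    """Split a 128-bit record into (location, dark, inv_white)."""
    location = (word >> 64) & MASK32
    dark = bits_to_float((word >> 32) & MASK32)
    inv_white = bits_to_float(word & MASK32)
    return location, dark, inv_white
```

Keeping the location in the same word as the two matrix values is what allows the camera interface to check, with a single read, that the buffered entry belongs to the pixel and band currently being assembled.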
The system uses logistic regression to classify the pixels, as was introduced in Section 3, Logistic Regression Algorithm. Presented below is a detailed explanation of the block that classifies the pixels, and subsequently the objects.

Pixel Classification

In order to perform the logistic regression calculation on the incoming pixels, a number of parallel blocks are implemented. The first is the normalization, which performs the calculations in Equations 3.1 and 3.2 on the incoming data in parallel. At this step there is a DSP block per calculation step per parallel data channel. The mean and standard deviation values are stored in on-chip RAM for easy access. The output of the normalization is passed to the inner product blocks.

Figure 3.7: Block diagram of the computation subsystem.

There is an inner product block for each class within each parallel channel. This corresponds to NUMBER_OF_CLASSES × NUMBER_OF_PARALLEL_CHANNELS DSP blocks, as each inner product requires only one DSP unit. For this prototype design, that means there are 16 × 5 = 80 parallel DSP blocks used for the inner product. The class coefficients are located in memory blocks for each class, addressed by band number. At the end of each pixel, the results of the inner product blocks for each parallel channel are added together to give one result per class. The output, then, is a vector of probabilities designating how likely it is that the pixel belongs to each class. This information is stored in on-chip memory for access by the user, as well as being passed to the object classification subsystem.

The computational complexity of this classification is found by analyzing the number of operations that could be happening at any one time. Once the system is fully in operation, all of the normalize DSP blocks and inner product blocks can be running at once.
Assuming this is the case, the performance of the pixel classification when running on the 70 MHz data clock with a 210 MHz operation clock is 4.55 GFLOPS. Concurrently, the on-chip memory bandwidth can be analyzed for each of the instantiated memory blocks. Taking into consideration the blocks for the means, standard deviations, classes, intercepts, and results, the total on-chip memory bandwidth is 44.8 GBytes/s. Much of the data pulled from these memory blocks is not actually used, but is required to satisfy the acceptable memory port width ratios. This number was found using the 70 MHz clock rate that is used to read or write to the memory on the processing side of the system.

Object Classification

In classifying the full object, data is used from the monochrome line scan camera as well as the pixel classifications obtained using the hyperspectral data. The line scan camera captures images at 80,000 frames (lines) per second (fps) and the Arria V FPGA performs calculations to find the location of an object. The line scan line number and pixel number are translated into the corresponding hyperspectral numbers on the Arria V to prevent transmission of numerous repetitive entries. This translation is done because the monochrome pixels are at a much finer resolution than the hyperspectral pixels. Information about the object's location is transmitted to the computation block, including the line number, an object number, and the beginning and ending pixel in that line that define the object. A record of object locations is kept in the computation unit on the Arria 10 and is used to accumulate pixel classifications within each object. After an object disappears from the scan line, the overall results are used in a lookup table to determine whether the class with the highest classification probability is good or bad, though future systems could look at the accumulated class probabilities of all the classes to make a decision.
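The accumulation and lookup described above can be sketched as follows. This is a simplified software model under stated assumptions, not the VHDL in the appendix: the class name, method names, and data layout are hypothetical, and pixel results are assumed to arrive as one vector of class results per hyperspectral pixel.

```python
class ObjectTracker:
    """Accumulates per-class pixel results for each tracked object (illustrative)."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.totals = {}  # object number -> accumulated result per class

    def add_line(self, object_number, line_results, first_pixel, last_pixel):
        """Add the results of the pixels first_pixel..last_pixel (inclusive) of one line."""
        totals = self.totals.setdefault(object_number, [0.0] * self.num_classes)
        for pixel in range(first_pixel, last_pixel + 1):
            for c in range(self.num_classes):
                totals[c] += line_results[pixel][c]

    def finish(self, object_number, class_is_good):
        """Call once the object has left the scan line; returns True to eject it."""
        totals = self.totals.pop(object_number)
        best_class = max(range(self.num_classes), key=lambda c: totals[c])
        return not class_is_good[best_class]  # lookup table: good class -> keep
```

Here `class_is_good` plays the role of the lookup table: only the top-ranked class is consulted, which mirrors the current decision rule, while the retained `totals` vector is what a future system could inspect in full.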
The decision to eject the object is based on this lookup and sent to the Arria V, which also controls the air jets. The final sorted results are made available to the HPS through a streaming process that feeds a modular Scatter Gather Direct Memory Access (mSGDMA) block, which writes the streaming data to SRAM belonging to the HPS.

Since not every pixel needs to be accumulated into an object, and pixel results do not show up on every clock edge, only one DSP block is implemented in this section. If it were constantly in operation, it would achieve a performance of 70 MFLOPS. The memory block that holds the accumulated results for each object has a theoretical maximum bandwidth of 4.48 GBytes/s. This is theoretical because, as with the adder, there will not be a memory access on every clock period due to the intermittent nature of the data that will be accumulated.

The VHDL code for operation of the computation subsystem can be found in Appendix B. These files are

• computation_unit.vhd
  – regression.vhd
    ∗ normalize.vhd
    ∗ channel_sum.vhd
  – sort.vhd
  – object_tracking.vhd

FPGA Interface

In order to communicate between the two camera subsystems, a high-speed (6 Gbps) serial interface was designed to connect the Arria V and Arria 10 FPGAs. The monochrome line scan camera is connected to, and its data is processed on, the Arria V, while the hyperspectral camera is connected to the Arria 10, where the inner product computations for classification occur. In order to make a full object classification, the information about each object's location needs to be passed from the Arria V to the Arria 10, and the ultimate decision to keep or discard each object is sent back from the Arria 10 to the Arria V, which also controls the air jet system.

Figure 3.8: The connection between the Arria V development board (top) and the Arria 10 development board (bottom).

One reason for
the two separate boards is the availability of FPGA Mezzanine Connectors (FMC) on the development boards. The monochrome line scan camera requires two Camera Link cables between the Arria V and the camera. The FMCs are connected to the FPGA in such a way that both connectors are required to connect all the desired signals for the camera. There are only two FMCs on the Arria V development board and, consequently, both are used for this camera. The hyperspectral camera also uses an FMC to connect to the FPGA. Since the Arria 10 is the larger device and includes hardened floating point, it is imperative that this board is used for connecting to the hyperspectral camera. Since a daughter card has been developed such that all the correct signals are routed to one FMC for the monochrome camera, both cameras could be connected to the Arria V but not to the Arria 10, as the available board only has one FMC. The Arria 10 board currently contains an engineering sample (pre-production) version of the Arria 10, since we have not been able to get a production evaluation board of the Arria 10. The production evaluation board will have two FMCs, and at that time the full design can be ported to one board, provided the logic fits on the device, eliminating the need for the communication interface. It is unknown at this point whether the Arria 10 has enough resources to fit the full design or whether two Arria 10 FPGAs will be needed.

The communication interface between the two FPGA boards is through SMA connectors connected to coax cables that use the high-speed transceivers in each of the FPGAs. Using the SerialLite2 protocol, the transceivers can establish a link and transmit data. SerialLite is a communication protocol that is particularly good for high-speed serial communication and has less overhead than other serial protocols. The protocol includes CRC checking as well as optional scrambling/descrambling of the data.
It can also be used with multiple receivers and a broadcast mode, if desired. Though the SerialLite2 IP core provided by Altera is not yet readily available for the Arria 10, it is indirectly supported. Since SerialLite3, which is available for the Arria 10, is not compatible with SerialLite2 due to different encoding schemes and packet structure differences, the SerialLite2 core was needed to communicate between the two boards. In addition to the SerialLite2 core, the native transceiver PHY cores are used for their respective devices. These implement the PMA (physical medium attachment) aspect of the transceivers as well as some of the PCS (physical coding sublayer) and handle the physical transmission of the data. The SerialLite2 core then sits on top of the native core and handles additional PCS tasks of transmission, such as providing a CRC for the data.

In order to set up the reference clock, which is required to be 156.25 MHz to be compatible with the reference clocks generally accepted by the transceivers, a programmable clock was needed. At first, this was done using the programmable oscillator on the Arria V development board, which provides the reference clock to the transceivers that are connected to the SMA connectors on the board, and is programmable over I2C from the Max V CPLD controller. After the FMC breakout board was completed, in order to provide additional SMA connectors for testing of the transceiver channel, the clock generator on the breakout board had to be programmed over I2C from the FPGA to generate the desired clock frequency. The clock generator chosen for this purpose has four one-time programmable (OTP) configurations, so that the correct frequency can be loaded on power-up. After the volatile RAM was programmed with the desired values, they were burned into a configuration in the OTP memory of the generator, and subsequent projects need only enable the output of the clock generator to get the desired frequency.
This made it much easier to ensure the right frequency was available at transmission time, rather than programming the clock each time the power was cycled. On the Arria 10 pre-production development board, the FMC breakout board can be used; however, the reference clocks connected to the clock generator outputs do not connect to the reference clock inputs on the same bank as the populated transceivers. One of the reference clocks on this bank is provided by a programmable oscillator on the development board that has approximately 10 clock outputs. Instead of programming yet another oscillator, the SMA clock outputs were found to provide a 156.25 MHz clock that can be transmitted over SMA to one of the receivers to be used as a reference clock on the breakout board.

Performance

A significant benefit of this system is the increase in performance over the previous method of processing. Resonon has previously used a camera with a frame rate of 140 fps, a spatial resolution of 640 pixels, and a spectral resolution of 240 bands. The system under development is built around a 500 fps camera sensor (full resolution) running at 2000 fps (partial resolution) with a spatial resolution of 1024 pixels (reduced to 256) and a spectral resolution of 160 bands. A large increase in spectral bands was neither needed nor desired by Resonon because they have found that the data becomes redundant and no longer useful after a certain point. With a clock speed of 70 MHz on the computation side and approximately 157 cycles to classify a pixel, this means that it takes only 1.57 µs to compute the classification for a full pixel. With 1024 pixels, this hyperspectral computation takes 1.6 ms per frame (i.e. line scan). The monochrome line scan camera can run at up to 80,000 lines per second, and the transmission rate between the two boards is 6250 Mbps. With 54 bits needed to represent the information per object per line, objects are transmitted at a rate of 115.74 MHz.
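The link figures quoted in this section can be reproduced with a short calculation. The sketch below is only a check of the arithmetic: it uses the raw 6250 Mbps line rate and ignores any encoding overhead, and it assumes that a packet consists of the data words plus one start word and one end word, which is an interpretation of the 32-bit packaging rather than something the text states explicitly. The function names are hypothetical.

```python
import math

def object_rate_hz(link_rate_bps, bits_per_object):
    """Objects per second if the link carried nothing but object records."""
    return link_rate_bps / bits_per_object

def packet_time_s(link_rate_bps, bits_per_object, word_bits=32, framing_words=2):
    """Time to send one object packed into whole words plus framing words.

    framing_words=2 assumes one start word and one end word per packet.
    """
    data_words = math.ceil(bits_per_object / word_bits)
    return (data_words + framing_words) * word_bits / link_rate_bps

rate = object_rate_hz(6250e6, 54)      # about 115.74e6 objects per second
duration = packet_time_s(6250e6, 54)   # about 20.48e-9 seconds
```

Under these assumptions a 54-bit record occupies two 32-bit words, so a framed packet is four words or 128 bits, which matches the 20.48 ns transmission time given for the packaged data.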
When packaged in 32-bit data words and including start and end packets, the transmission is still accomplished in 20.48 ns. Since the monochrome line scan calculations can run on an 83.5 MHz clock, the Arria V is able to keep the transmission buffer full (i.e. calculations take place faster than they are needed) without overflowing it, since objects will not be found on every clock edge. The decision on the hyperspectral side is made using the 70 MHz clock that the computation unit runs on, so there may be some dead spots in the return transmission. Upon receipt of the object information, the line and pixel numbers are stored, indexed by object number, in an array that is updated with each transmission to note where objects are on the line.

IMPLEMENTATION DETAILS

Programmable Oscillator

In order to achieve an accepted clock frequency for the transceiver reference clock on the Arria V board, a clock generator had to be programmed. The first iteration involved programming the programmable oscillator provided on the development board, a Si570 device from Silicon Labs. In order to achieve the desired frequency of 156.25 MHz, 6 of the available registers must be programmed via the I2C lines, which are connected to the Max V CPLD system controller that is also on the board. The oscillator does not have persistent memory, so it must be reprogrammed after every power loss to have the desired frequency on every run of the device. This can be arranged by programming the Max V to run the I2C code as part of the device configuration. Programming the oscillator requires knowledge of the current frequency and register values, as the calculations for new values are based on the current configuration. These default values and the new values were obtained using the Clock Controller GUI provided as part of the board test system from Altera.
Registers

The device has two sets of identical registers, one set for devices with 20 or 50 ppm temperature stability, and the other for devices with 7 ppm temperature stability. The oscillator provided has 7 ppm temperature stability and 20 ppm total stability, as determined by the part number. The critical values needed to program the registers are the output divider values (N1 and HS_DIV) and the crystal frequency multiplication ratio (RFREQ). The output dividers are found by changing the existing values as little as possible while keeping the digitally controlled oscillator (DCO) frequency within the acceptable range of operation. The factory default is a 100 MHz clock with divider values and DCO frequency as shown in Figure 4.1. Using the GUI, the necessary values to program were easily obtained, as shown in Figure 4.2. Though provided by this tool, they can also be found using a few equations, which were used in the MATLAB script created to print the VHDL constants for the programming of the registers. Based on the required values, all the registers needed to be programmed with the values shown in Table 4.1. The steps to derive these values are in Equations 4.1, 4.2, and 4.3 [15]. The RFREQ value is a 38-bit number with 28 fractional bits, so the register value is divided by 2^28 to obtain the correct fractional value prior to performing the calculations, and the result is multiplied by 2^28 at the end to shift the binary point back. The values for HS_DIV and N1 are chosen from a selection of allowed values with the goal of minimizing the DCO frequency (fdco) within the acceptable range, while also achieving the lowest possible N1 and the highest HS_DIV.
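The register arithmetic described here, together with Equations 4.1 to 4.3 and the values of Table 4.1, can be checked with the following sketch. It is not the MATLAB script used in the project. The bit layout (HS_DIV in the top three bits of the first register, a 7-bit N1 split across the first two registers, and a 38-bit RFREQ in the remainder) and the HS_DIV code mapping follow the author's reading of the Si570 register map and should be verified against the datasheet [15]; the function names are hypothetical.

```python
HS_DIV_CODES = {0: 4, 1: 5, 2: 6, 3: 7, 5: 9, 7: 11}  # register code -> divider value

def decode_si570(regs):
    """Decode six consecutive register bytes (registers 13 to 18 here)."""
    hs_div = HS_DIV_CODES[regs[0] >> 5]
    n1 = ((((regs[0] & 0x1F) << 2) | (regs[1] >> 6)) + 1)     # stored as N1 - 1
    rfreq_raw = ((regs[1] & 0x3F) << 32) | int.from_bytes(bytes(regs[2:6]), "big")
    return hs_div, n1, rfreq_raw / 2**28                      # 28 fractional bits

def output_frequency_mhz(old_regs, old_f_mhz, new_regs):
    """Frequency produced by new_regs, given that old_regs produced old_f_mhz."""
    hs_old, n1_old, rfreq_old = decode_si570(old_regs)
    f_xtal = old_f_mhz * hs_old * n1_old / rfreq_old          # Equation 4.1
    hs_new, n1_new, rfreq_new = decode_si570(new_regs)
    f_dco = f_xtal * rfreq_new                                # Equation 4.3 rearranged
    return f_dco / (hs_new * n1_new)                          # Equation 4.2 rearranged

OLD = [0x22, 0x42, 0xBC, 0x30, 0xEE, 0xFA]  # Table 4.1, old values
NEW = [0xA0, 0xC3, 0x13, 0xB7, 0x0C, 0xD9]  # Table 4.1, new values
f_out = output_frequency_mhz(OLD, 100.0, NEW)
```

Under this layout the old values decode to HS_DIV = 5 and N1 = 10 (a 5000 MHz DCO at 100 MHz), and the new values decode to HS_DIV = 9 and N1 = 4, which gives a 5625 MHz DCO at the 156.25 MHz target and is consistent with preferring a low N1 and a high HS_DIV.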
\[ f_{xtal} = \frac{f_0 \times HS\_DIV \times N1}{RFREQ} \tag{4.1} \]
\[ f_{dco} = f_1 \times HS\_DIV \times N1 \tag{4.2} \]
\[ RFREQ = \frac{f_{dco}}{f_{xtal}} \times 2^{28} \tag{4.3} \]

Table 4.1: Register settings for Si570

Register Number    Old Value (Hex)    New Value (Hex)
13                 22                 A0
14                 42                 C3
15                 BC                 13
16                 30                 B7
17                 EE                 0C
18                 FA                 D9

Figure 4.1: Factory Default Clock Register Settings for Si570

Figure 4.2: Preferred Clock Register Settings for Si570

In order to program the device, an I2C master component provided by Scott Larson on EE Wiki [16] was used. A state machine was devised to progress through each of the registers and start individual transactions with the master driver. Following each write, a stop is sent, rather than writing continuously, in order to ensure that the correct register is written to each time. Since all registers are written sequentially, this is not strictly necessary and all registers could have been written in a streaming write sequence, but using individual transactions ensures that a specific register receives the data designated for it. This also set up the state machine in a useful manner for the clock generator, which does not require programming of all registers. The code for the implemented driver can be found in Appendix B under i2c_driver.vhd.

Programmable Clock Generator

In order to further test the transceiver communication, two sets of transceivers were required. Since the development board for the Arria V only contains one set of SMA connectors and the Arria 10 engineering sample development board does not contain any, a daughter card was fabricated to use the transceivers through the FMC connector, with SMA connections. In order to achieve a viable reference clock on the transceivers used by the daughter card, a clock generator was included on the card along with the necessary circuitry.
The VersaClock 6 Low Power Programmable Clock Generator from Integrated Device Technology was chosen because it is programmable over I2C, it has two configurable clock outputs, and it has the option for four one-time programmable configurations stored in non-volatile memory. The one-time programmable configurations are appealing in this project because no setup is required once the configuration has been programmed; the required frequency will be available on power-up of the device, unlike with the oscillator on the development board.

Design Decisions

Much of the additional circuitry required by the clock generator is specified in the datasheet, with recommendations such as using a 25 MHz crystal and terminations for different output configurations [17]. One of the design decisions made concerns the connections of the I2C lines and the select line, pins 8, 9, and 24 respectively, as shown in Figure 4.3. Pin 24, OUT0_SEL_I2CB, is used to determine whether pins 8 and 9 will be select lines for one of the four stored configurations or the clock and data lines for I2C communication. If pin 24 is connected to a pull-up resistor, pins 8 and 9 will be select lines; otherwise, they will be used for I2C. Consequently, a pad was placed on the PCB to enable a pull-up to be used, but it was not populated so that the device could be programmed over I2C. After power-up, this pin also serves as a clock output, acting as a buffer for the selected reference clock [17]. Each of the clock outputs is connected to a reference clock pin for the transceivers through the FMC connector, and one of them is also connected to the global clock network for use in FPGA logic, if desired.

Figure 4.3: Diagram of Pin Assignments for VersaClock 6 Programmable Clock Generator [17]

Registers

The VersaClock Clock Generator has registers programmable for four output clocks, despite the fact that there are only two output clocks available on the device, in addition to the reference clock output.
The registers available to be programmed include settings for the internal PLL divider and the output dividers, both integer and fractional. There is also the option to choose between the crystal reference and a reference clock provided by the FPGA. The pins are shown in Figure 4.3. The registers chosen to program include those for the programmable capacitors, the internal PLL frequency dividers, and the output dividers.

The values for the programmable tuning capacitors were chosen based on Equation 4.4 [17] with an estimated combined stray and external capacitance of 2 pF. Several values were tested to verify the choice, but no large difference between the results was noticeable on an oscilloscope, so the originally designated values were kept.

In choosing the values for the PLL frequency dividers and the output frequency dividers, a voltage controlled oscillator (VCO) frequency of 1250 MHz was targeted, which is the lower bound of the desired range for the oscillator. Using this value with the known expected output frequency of 156.25 MHz meant that no fractional divider values were needed for the PLL or the output, which means fewer registers to program as well as a more accurate clock division. A MATLAB script was used to print out the desired register configurations and the resulting VHDL code is included in Appendix A.

\[ C_L = \frac{9\,\mathrm{pF} + 0.5\,\mathrm{pF} \cdot XTAL[5:0] + C_s + C_e}{2} \tag{4.4} \]

Burning a Configuration

Unlike the programmable oscillator on the development board, the clock generator can hold four non-volatile configurations. The benefit of using a non-volatile configuration is that the clock output is available very soon after power-up, without having to re-program the generator each time.
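Before a configuration is burned, the ratios it encodes are worth a quick numerical check. The following is a minimal sketch, in Python rather than the MATLAB script used in this project, of the arithmetic behind the values discussed under Registers: it confirms that a 1250 MHz VCO divides exactly from the 25 MHz crystal and to the 156.25 MHz output, and evaluates Equation 4.4 for a given capacitor code. The function names and the example XTAL code of 10 are illustrative assumptions, and the mapping of these ratios onto the device's divider registers is not modeled.

```python
from fractions import Fraction

# Frequencies stated in the text, in Hz.
F_REF_HZ = Fraction(25_000_000)     # crystal reference
F_VCO_HZ = Fraction(1_250_000_000)  # targeted VCO frequency
F_OUT_HZ = Fraction(156_250_000)    # required output clock


def divider_ratio(f_high, f_low):
    """Return the exact ratio f_high / f_low as a Fraction."""
    return Fraction(f_high) / Fraction(f_low)


def is_integer_ratio(ratio):
    """True when the ratio needs no fractional divider part."""
    return ratio.denominator == 1


def load_capacitance_pf(xtal_code, stray_plus_external_pf=2.0):
    """Equation 4.4: CL = (9 pF + 0.5 pF * XTAL[5:0] + Cs + Ce) / 2."""
    if not 0 <= xtal_code <= 0x3F:
        raise ValueError("XTAL[5:0] is a 6-bit field")
    return (9.0 + 0.5 * xtal_code + stray_plus_external_pf) / 2.0


feedback_ratio = divider_ratio(F_VCO_HZ, F_REF_HZ)  # VCO relative to crystal
output_ratio = divider_ratio(F_VCO_HZ, F_OUT_HZ)    # VCO relative to output

print("VCO / reference:", feedback_ratio, "integer:", is_integer_ratio(feedback_ratio))
print("VCO / output   :", output_ratio, "integer:", is_integer_ratio(output_ratio))
# Hypothetical capacitor code, used only to exercise Equation 4.4.
print("CL for XTAL code 10:", load_capacitance_pf(10), "pF")
```

Using exact fractions rather than floating point makes it unambiguous whether a candidate VCO frequency requires a fractional divider.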
To burn a configuration, all the registers in RAM were set to the desired values, the VCO was calibrated, and then the registers designated for control of the OTP were programmed to define the registers to burn and to check that the burn completed successfully. When bit 7 in the OTP Control register is set, the part automatically loads data from OTP on power-up.

Utilizing the Clock Generator

With the needed configuration burned into the part and automatically loaded on power-up, the only steps required before the transceivers can use the clock are to enable the output and select the appropriate reference clock. For the default configuration burned, the default reference is the crystal input at 25 MHz. The enable and select signals are both driven low on pins 6 and 7.

Altera IP

Within the Quartus software, Altera provides many different IP blocks as "Megafunctions" that can be customized and dropped into a design. These make handling transmission interfaces or creating memory blocks much simpler. However, in our efforts to make the system as modular as possible, some of these had to be bypassed and implemented by hand. Fortunately, the compiler will synthesize the components and create the desired blocks even when they are not created as a megafunction. One thing that requires care when writing a block by hand, however, is the set of rules the block must obey. For instance, a dual port memory is very tricky to implement by hand, as it cannot have arbitrary values on either side of the block. A benefit of creating the unit within the Megawizard is that the tools inform the user of valid values for each of the parameters. Without this interface, users must choose their values carefully or learn of a failure when the design is compiled. This was encountered in the memory block instantiations used within the computation subsystem.
To avoid creating multiple different memory components, and in an effort to create a modular design, a memory block component was created that instantiates the Altera altsyncram megafunction with generic parameters supplied at the time of instantiation. This is a perfectly reasonable approach until a port width ratio is violated. Rule violations happened several times over the course of development, and fixing them resulted in extra locations within the memory block that were skipped on one port and ignored on the other, but were required for the port ratios to work within the block. This is an unfortunate waste of memory but not a major concern for the design as it currently stands.

Timing Constraints

The hyperspectral camera is able to produce data at 6.6 Gbps with 500 fps when using the full 1280x1024 pixel image. With the image size reduced to 256 pixels for this application, the frame rate is up to 2014 fps. After the parallel data streams are combined, the data is passed into the computation unit at 66 MHz. To stay ahead of the incoming data, the computation unit needs a base clock at least this fast, though preferably faster. Fortunately, faster is possible: the base clock in the regression unit is targeted at 70 MHz. One of the tricks in running faster than the data is produced is to ensure that the blank times do not affect the overall results, since the unit is constantly adding in new values over each pixel. Therefore, a signal was added to classify each incoming data chunk as valid or not. Using this signal, the computation unit determines whether or not the value should be added into the existing calculations. The valid signal is not, however, simply the inverse of the error signal passed by the camera block. Error is set when the incoming data is bad or the location of the white/dark matrix values does not line up with the location of the incoming data.
In this case, the pixel currently being calculated is zeroed out and no incoming data is considered until the start of the next pixel.

The design uses two primary clocks for the computations and classification, one with a frequency three times that of the other. The slower clock is required to keep up with the incoming data rate. The triple-speed clock was included when it was found that the floating point adder and multiplier each take three cycles to complete. In the original design of the inner product unit, a multiplier was pipelined with an adder, but the result from the adder was needed as an input to the multiplier for the following calculation. This was a carryover from a previous implementation which received data one spectral bin at a time, rather than one pixel at a time. Once the design was changed to accommodate data arriving for a full pixel before moving on, the inner product unit could also be changed. Quartus provides a multiply-accumulate floating point megafunction that completes in four cycles. Using this, the faster clock was no longer needed in this unit and data alignment was much easier. The faster clock was kept, though, and is used in the normalization step and in the combination of the parallel data for the benefit of speed.

A challenge encountered in the timing requirements of the computation unit was achieving the correct setup and hold timing for each of the clocks, along with a maximum clock frequency at least as high as the desired run frequency. The Quartus software contains a timing analyzer known as TimeQuest, which checks paths, analyzes timing requirements, and provides statistics on each of the paths. It also provides some recommendations to help close timing, when possible. With the first inner product unit design, TimeQuest reported a maximum rated frequency for the faster clock that was 100 MHz below where it needed to be in order to be triple the speed of the other clock.
This issue was the primary motivation for changing the design of the inner product unit. The paths that were failing setup timing were all related to the inner product, and the Chip Planner, another tool within Quartus, was used to show the paths that were being taken. In most cases, the path involved an unnecessary stop at a register before passing back into the DSP block. Switching out the adder and multiplier for the multiply-accumulate megafunction significantly decreased the number of required paths and registers. Routing was therefore simpler and, with fewer registers required, clocks were not bouncing around nearly as much. A few changes were needed to accommodate the new architecture, but it helped with timing immensely and functionality was verified in MATLAB. The change allowed the faster clock to reach a maximum frequency up to 100 MHz above its required speed, and the ceiling for the slower clock also increased significantly. The setup timing failures were also eliminated with the removal of the extra registers outside of the DSP blocks.

In analyzing the compilation results generated by Quartus, it was found that the software was optimizing out several design-critical signals, including the data inputs, which caused much of the subsequent logic to be optimized out as well. After a few changes were made to combat these optimizations, including fixing the parenthesization around signal indices and utilizing the 'noprune' attribute, new timing errors were uncovered. This is one of the biggest pitfalls in working with software tools and large projects. There are limitations on the sizes of the inputs, and the optimizer will remove seemingly unused signals. A developer who is unaware of these optimizations could be lulled into a false sense of security. Fortunately, this was discovered and work was done to check whether the removals were legitimate.
Most of them actually were, because of extra space allocated in a signal that ends up never changing or remaining unused.

The timing battle continues when the full design is compiled together. Not only is the routing more challenging, but new setup timing errors are uncovered because of the routes taken. Though floorplanning was attempted, in most cases it actually prevented the fitter from being able to fit the design. This simply continued the need for timing analysis and tweaks to return to minimal setup and hold errors in each of the clocks.

Toolchain Fights

Some of the greatest frustration in implementing the design was simply in figuring out how to work with the tools and obtain the desired results from them. Oftentimes, it was a matter of tweaking settings in the software to display what you want to see (and what is actually occurring), rather than a problem with the hardware code being tested.

SignalTap

Altera provides an internal logic analyzer to watch signals in a design that could not be reached from an external analyzer. This is helpful in debugging a design; however, since the logic analyzer uses device resources in the FPGA, any time a change is made in the analyzer, the design has to be re-compiled. Additionally, the extra resource usage can make it a challenge for large designs. It is therefore best suited to modular designs, where a portion can be broken out and examined without requiring the full design. This was a frequent problem when debugging the part of the project that communicates with the external DRAM, because there was no other way to look at the signals, and the particular signals that this project passes in and out of DRAM are 128 bits wide, so they each take up a lot of resources. The trick, then, is to choose only the signals that critically need to be observed and to minimize the size of the overall project, scaling it back up after debugging is completed.
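To make the resource trade-off concrete, the following is a rough sketch of how capture memory grows with signal width. It assumes the analyzer stores one bit per tapped signal bit per sample, so the memory needed is approximately the total tapped width times the sample depth. The sample depth of 4096 and the trimmed signal list are illustrative assumptions rather than settings used in this project, and trigger logic overhead is ignored.

```python
def capture_memory_bits(signal_widths, sample_depth):
    """Approximate on-chip memory (in bits) needed to capture the given signals.

    Assumes one stored bit per tapped signal bit per sample; trigger and
    control logic overhead is not modeled.
    """
    if sample_depth <= 0:
        raise ValueError("sample_depth must be positive")
    if any(width <= 0 for width in signal_widths):
        raise ValueError("signal widths must be positive")
    return sum(signal_widths) * sample_depth


# Hypothetical tap lists: two full 128-bit DRAM data buses versus a trimmed
# set (a 16-bit slice of one bus plus two single-bit control signals).
SAMPLE_DEPTH = 4096  # assumed depth, chosen only for illustration
full_tap = capture_memory_bits([128, 128], SAMPLE_DEPTH)
trimmed_tap = capture_memory_bits([16, 1, 1], SAMPLE_DEPTH)

print(f"full buses : {full_tap} bits")
print(f"trimmed set: {trimmed_tap} bits")
print(f"reduction  : {full_tap / trimmed_tap:.1f}x")
```

Under these assumptions, tapping a slice of a wide bus instead of the whole bus reduces the capture memory in direct proportion to the bits removed.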
TimeQuest Timing Analyzer

Altera's TimeQuest Timing Analyzer is both a useful tool and a nuisance. It is helpful in predicting timing and showing what the maximum achievable frequency is. However, it requires proper user input to interpret how data is moving through the design and how different clocks are related. Given the use of a clock set to be three times faster than another clock, and data that moved freely from one clock's domain to the other, this guidance for the tool was critical. Without it, the setup and hold analysis had a total failure through the system of hundreds of seconds. The needed input was the correct set of multicycle paths to tell the tool how to analyze data that crosses clock domains between the system clock and the triple-speed clock. With this information added, the setup time error went from hundreds of seconds to twenty seconds for the whole project. Further, upon changing the inner product unit, the maximum frequency of the clocks was correctly above where it needed to be and the setup time error was completely eliminated. However, upon removal of some of the optimizations that were compiling away needed registers, some of the setup timing errors returned. These occurred mostly in the line scan camera side of the project, as the object is 'built'. It is anticipated that utilizing the clock from the transceiver block that actually corresponds to the line scan data will fix some of those errors.

Chip Planner

The Chip Planner utility provided with Quartus is extremely useful in visualizing where resources are being used and the proximity of certain resources to each other. It displays where each of the registers, DSP blocks, and I/O are being used for the design after a compilation. It also shows data paths and can be linked to from TimeQuest for viewing critical paths. A useful aspect that helps with timing is the floorplanning feature.
As the designer for the project, I was able to group signals together and instruct the fitter to place them co-located. In doing so, the compile time decreased because paths were found more easily, and the clock speed increased. This is not always the case, though. When a floorplanning technique from a project containing only the regression step was applied to the project containing the full computation unit, the fitter was unable to fit the design. This is likely due to the significant increase in size of the overall project: the additional resources and paths prevented the use of the same fitting techniques that were used in the smaller project. Nevertheless, even grouping a small portion of the design together helped the fitter find placement for the whole design sooner and more efficiently than if it were left to do so without clues as to the grouping.

Grouping for floorplans was particularly helpful in this project because of the way the design is implemented. Due to the many generate statements used for working with the parallel channels, it helps the fitter to define what data is moving through each path, since that should be clear to the user. In doing so, the fitter is able to try to place the appropriate signals and data paths in proximity to related paths. A fitted floor plan for the production Arria 10 development board is shown in Figure 4.4.

Figure 4.4: A fitted floor plan in the Arria 10.

MATLAB

The design was verified using the HDL Verifier toolbox available within the MATLAB tool set. To use this with the floating point computational blocks, a few specific files need to be included in a particular order to ensure correct compilation with all libraries able to be located. Using this method, in conjunction with ModelSim cosimulation, was useful in verifying the design, but not easy to figure out at first. The HDL Verifier is particularly useful in large projects
such as this because it provides inputs to the system and the outputs can be compared with MATLAB calculations for easy verification. However, it was also useful to have ModelSim running the design because it was sometimes easier to follow the data path through each of the signals visually, rather than trying to pull out the right information on the outputs. This was also a way to track internal signals without having to port them out.

Verifying with MATLAB is an exercise in making sure that the functionality of the design is fully understood. The design has to be programmed both in the MATLAB language and in the hardware being tested. Because it is user code testing user code, it is important that the desired functionality is fully understood and that the MATLAB code is believed to be correct. It is often necessary and useful to do a couple of iterations by hand in order to assure oneself that the MATLAB program works. When this does not happen, the debugging process is far more frustrating.

This was experienced in testing the regression system. MATLAB was used to provide the inputs and, upon receiving an interrupt, it read the outputs from the registers and wrote them to a file. The results in this file were compared to those found by MATLAB on the same inputs to determine accuracy. At first, it appeared to be working. Upon switching the regression to use an accumulator in the inner product block, verification became unreliable. Though the MATLAB code did not change, the results from the VHDL could not be made to match it, and the VHDL made sense. Upon closer inspection, it turned out that the MATLAB script was calculating the inner product incorrectly, and thus previous results had also been verified incorrectly. The updated MATLAB script was verified by hand for a couple of pixels to confirm that it was indeed correct. With this change, the VHDL was also verified to be accurate.
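One way to carry out such a hand check is to keep the reference model small enough to evaluate on paper. The sketch below is written in Python rather than the MATLAB used in this project; the function names, the sample values, and the comparison tolerance are all illustrative assumptions, and it does not reproduce the rounding behavior of the hardware multiply-accumulate block.

```python
import math


def inner_product_accumulate(pixel, coefficients):
    """Multiply-accumulate over one pixel's spectral bins, one bin at a time."""
    if len(pixel) != len(coefficients):
        raise ValueError("pixel and coefficient vectors must be the same length")
    accumulator = 0.0
    for sample, weight in zip(pixel, coefficients):
        accumulator += sample * weight
    return accumulator


def matches_reference(hardware_value, pixel, coefficients, rel_tol=1e-6):
    """Compare a value read back from hardware with the reference model.

    rel_tol is an assumed tolerance for single-precision results.
    """
    expected = inner_product_accumulate(pixel, coefficients)
    return math.isclose(hardware_value, expected, rel_tol=rel_tol, abs_tol=1e-9)


# Hand-checkable case: 1*4 + 2*5 + 3*6 = 32
pixel = [1.0, 2.0, 3.0]
coefficients = [4.0, 5.0, 6.0]
reference = inner_product_accumulate(pixel, coefficients)

print("reference inner product:", reference)
print("hardware value 32.0 accepted:", matches_reference(32.0, pixel, coefficients))
print("hardware value 30.0 accepted:", matches_reference(30.0, pixel, coefficients))
```

Running the reference model on a case whose answer is known in advance checks the checker itself before it is trusted to judge the hardware.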
The faulty reference script provided an important lesson in verification, as the error would not have been discovered if the inner product unit had not changed.

MATLAB was also used to generate the code for the register constants that are sent over I2C to the clock generator. By modifying a previously existing script, the register definitions could be documented with the defaults and the desired values. For any future changes, the user can simply change the values in the script and re-generate the code. The script generates a series of constant definitions to be pasted into the VHDL file that controls the command transmission. The HDL Verifier was used to verify the functionality of the I2C driving state machine to ensure that the device address, register address, and data are sent and can be acknowledged correctly.

Toolchain Tricks

As previously mentioned, timing or optimization errors were often the result of the tools misinterpreting the desired design. Many of the changes made involved manipulating settings in the software to guide the Quartus toolchain's interpretation. These changes are detailed in this section.

In Quartus, attributes are used to assist the tools in interpretation and to ensure that particular conditions are kept, in contrast to what the tools might otherwise infer. One of these attributes is 'noprune'. This is used to keep the synthesis analyzer from removing a signal from the design. It is declared in the architecture prior to the 'begin' statement as a boolean. The boolean is then assigned to the appropriate signal and set to 'true'. This was used in the object tracking file to ensure that the tracking array was kept completely in the design. See Appendix B for the object_tracking.vhd file and the usage of 'noprune'.

An additional resource to assist in design compilations is the Compiler Settings found in the Settings menu of Quartus.
Within this section, there are Advanced Settings available for both Synthesis and the Fitter. These settings were used primarily when the design was having trouble fitting in the device. Some of the changes made in the Fitter settings include changing the optimization technique to optimize for speed, changing the fitter seed value (a random number, different from the default, was used), setting the optimization mode to 'high performance effort', and setting the fitter aggressive routability optimization to 'always'. Many of these settings default to 'automatically' or 'off' or, if a range is possible, to the middle option. Changing these settings alerts the tools to the user's priorities in the design and directs the maximum possible effort toward fitting the design to the device. The changes made for this design were done to prioritize timing closure regardless of increases in compile time or increased difficulty in fitting, so long as a fit was achieved.

Using the Chip Planner to set Logic Lock regions is another useful way to assist in the fitting of the device and optimizing for timing. Setting these regions requires knowledge of the signals or resources that should be included in each region. Incorrectly setting these could cause the fit to fail. Both scenarios were experienced in the development of the computation system and the full system. However, the regions were used to separate out the parallel resources for ease of interpretation by the tools.

TEST AND VERIFICATION

Camera Interface

The interface responsible for taking the data from the camera, combining it with data from the DRAM FIFO, and assigning location information was tested via MATLAB cosimulation and ModelSim. This was done by simulating the data from the camera with memory blocks per tap and assigning location information, verifying that the locations were being assigned correctly.
Subsequently, the DRAM interface was added, and the steps of writing to the DRAM and pulling from it, in addition to combining location information with the incoming camera data, were tested and verified using SignalTap. A couple of different scenarios were checked, such as the case where the location from DRAM does not match the expected location corresponding to the camera data location, so that the error flag needs to be set and all subsequent data can be ignored until location zero is encountered again.

DRAM

The interface with the DRAM was primarily verified using the SignalTap Logic Analyzer. An incrementing counter was written to the DRAM and then the same space was read sequentially in a repetitive fashion to ensure that the memory controller was functioning correctly. This was further verified with the use of the buffer on the read side when combined with the camera interface. At the time of this publication, the interaction between the HPS and the DRAM had not yet been verified, though this can be done by passing the values read on the FPGA back to the HPS for comparison with the values originally written to the memory.

Computation Unit

The computation unit was developed and tested in sections. All sections were verified using MATLAB cosimulation. First, each of the components within the regression calculation was developed and tested individually. These are the inner product unit and the normalization block, with corresponding test bench scripts inner_product_tb.m and normalize_tb.m that can be found in Appendix C. The functionality of the megafunction which converts the fixed point numbers to floating point was verified with the normalization block. Testing incrementally in this way also assisted in the development of the component as a whole, as it relies on knowledge of the latency through each block to trigger some signals, such as the signal indicating that a new pixel is beginning in the inner product or that a result is ready on the output.
The inner product block was tested with the normalization by applying the values from MATLAB to the input ports and using the known latencies to verify the output, before the full unit was tested as it is expected to be used. This means utilizing the Avalon memory mapped interface to read and write registers and accessing the results from memory after being triggered by an interrupt. This interrupt was later moved in the full computation unit to be utilized for a different memory block.

The full verification of the regression was completed by writing the class coefficients, mean, and inverted standard deviation values to memory and piping in the input values after setting the enable bit and interrupt enable bit. Upon completion of the image matrix, the test bench spins on the interrupt until it is set. At this point, it reads from the results memory block. The results read from the system are written to a spreadsheet along with the expected values, as calculated in MATLAB, and compared for accuracy. After satisfactory completion of this test, a full frame of a small image is tested to verify that accurate results are obtained for each line in an image. The test bench file for this verification is regression_tb.m (see Appendix C).

Other components of the computation unit verified individually in MATLAB include the sorting block and the object classification block. The sorting block was verified by reading the printout of sorted results to visually check that they are sorted, and then checking that the indices line up with the sorted results (see sort_tb.m in Appendix C). The ModelSim output was analyzed to verify the expected two clock cycle latency for sorting. This verification was also useful in determining the order in which the elements are sorted, whether from least to greatest or vice versa, so as to correctly interpret the results internally to the computation unit.
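The visual check described for the sorting block can also be automated. The following is a minimal sketch, in Python rather than the MATLAB test bench referenced above, of the three properties involved: the output is monotonic, each returned index points back at the matching input value, and the sort direction is reported. The function name and sample data are illustrative assumptions and do not reflect the contents of sort_tb.m.

```python
def check_sort_result(original, sorted_values, indices):
    """Check a sorter's output against its input.

    Returns "ascending" or "descending" when sorted_values is ordered and
    every index points back at the matching input value; raises ValueError
    otherwise. An all-equal list is reported as "ascending".
    """
    n = len(original)
    if len(sorted_values) != n or len(indices) != n:
        raise ValueError("length mismatch")
    # Each input position must be used exactly once.
    if sorted(indices) != list(range(n)):
        raise ValueError("indices are not a permutation of the input positions")
    # The reported index must point at the value that was output.
    for position, (value, index) in enumerate(zip(sorted_values, indices)):
        if original[index] != value:
            raise ValueError(f"index mismatch at output position {position}")
    neighbors = list(zip(sorted_values, sorted_values[1:]))
    if all(a <= b for a, b in neighbors):
        return "ascending"
    if all(a >= b for a, b in neighbors):
        return "descending"
    raise ValueError("values are not sorted")


# Hypothetical class scores for one pixel and a sorter output to be checked.
scores = [0.1, 0.7, 0.2]
direction = check_sort_result(scores, [0.7, 0.2, 0.1], [1, 2, 0])
print("sort direction:", direction)
```

Reporting the direction rather than assuming it mirrors the point made above: the order must be known before downstream logic can interpret the first element as the best or the worst match.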
The object classification block was verified by creating a few sample objects in Paint that are simply black on a white background for a clear distinction. The image was read into MATLAB and the resulting data was used as the simulated transmission from the monochrome line scan camera. A small section was used to check that the correct classification results were being compiled over the object and that a definitive answer was correctly given at the end of the object (see objects_tb.m in Appendix C).

Originally developed within the object classification block is a component which converts the monochrome pixel number to the hyperspectral pixel number. This was also verified individually in MATLAB using camera_ratios_tb.m (see Appendix C) by generating a plot to relate the hyperspectral pixel numbers with the monochrome pixel numbers, as shown in Figure 5.1. Figure 5.2 shows a section of the same plot, depicting the nature of the relations. This component was moved to the Arria V side of the system to alleviate the need for an excessive number of transmissions.

Figure 5.1: Generated plot depicting ratios between the pixels of the line scan camera and the pixels of the hyperspectral camera.

Figure 5.2: Generated plot depicting ratios between the pixels of the line scan camera and the pixels of the hyperspectral camera, zoomed in for greater detail.

Because the inputs run on several different clocks, testing of the full computation unit from camera interface input to results of object classification has not been performed in simulation, but with each of the components working as expected, the author is confident in the full system functionality. This will be tested further as development continues.

FPGA to FPGA Transmission

The interface using the transceivers was verified by first sending information between two different transceivers on the same board, before trying to link the two boards together.
The packet generator was sending counter values and the checker was looking for counter values independently, enabling this same system to be used when transmitting between the two boards. The packet generator and checker systems were provided in a design example from the Altera Wiki [18]. SignalTap was used to check error signals from the pattern checker and the SerialLite2 core. In verifying the transmission between boards, different transmission speeds were tested, including the maximum rate that the Arria V can support, 6.5536 Gbps. At this rate, there were significant errors in the transmission as bits flipped. The goal was to have at least 6 Gbps, and this was achieved with minimal errors at a transmission rate of 6250 Mbps, or 6.250 Gbps. The high-level files relevant for this testing are:

• a10_com.vhd
  – xcvr_core.vhd
    ∗ a10_phy.vhd
    ∗ sl2_core.vhd
    ∗ xcvr_pll.vhd
• packet_generate.vhd
• packet_verify.vhd

The xcvr_core.vhd file can be found in Appendix B, as can the top-level a10_com.vhd file. The others were either provided by the design example or generated in the Megawizard for use in the project. A similar structure was used for testing on the Arria V and the top-level file for that project (a5_com.vhd) can also be found in Appendix B.

CONCLUSION

A dynamic and powerful real-time image processing system is being developed on an FPGA for application in sorting systems. The Arria 10 FPGA is utilized for its high speed transceivers in addition to its hardened floating point DSP blocks and hardened memory controllers. Development of the system in VHDL enables the use of generic parameters to accommodate possible changes in the camera front-end to the system. As a result, the system is modular and can be utilized in various spaces. Test and verification of the system has been performed using tools provided by Altera and MathWorks to test individual subsystems as well as various combinations of subsystems.
Further development and testing will be required as the hardware is developed and put in place for actual camera interactions with the FPGA. The prototype developed demonstrates the benefit of floating point calculations in an FPGA for real-time processing. Techniques utilized here can be taken for use in a custom-built board on which a single smart camera system can reside. 57 REFERENCES CITED 58 [1] R. Snider, “Unpublished proposal in response to the montana board of research and commercialization technology request for proposals, research and commer- cialization projects, fiscal year 2016 guidelines,” 2015, unpublished. [2] “What is spectral imaging and when should i use it?” White Paper, Resonon. [3] G. Lokman and G. Yilmaz, “Hyperspectral image classification using support vector neural network algorithm,” pp. 239–243, 2015. [4] (2016) Food sorting machines market: Global industry analysis and opportunity assessment 2015-2025. Future Market Insights. 616 Corporate Way, Valley Cottage, NY 10989. [Online]. Available: http://www.futuremarketinsights.com/ reports/food-sorting-machines-market [5] “mvbluegemini technical details,” Matrix Vision GmbH, Talstrasse 16, 71570 Oppenweiler, 2016. [6] “mvbluelynx-x technical details,” Matrix Vision GmbH, Talstrasse 16, 71570 Oppenweiler, 2014. [7] (2016) Matrox iris gt with matrox design assistant 4. [Online]. Available: http: //www.matrox.com/imaging/media/pdf/products/iris gt da/iris gt da.pdf [8] (2015) Razercam: Highspeed smart kamera for machine vision. Eye Vision Technology. 76131 Karlsruhe Germany. [Online]. Available: http://www. evt-web.com/fileadmin/img/products/RazerCam/RazerCam 15 EN V004.pdf [9] “Boa smart vision system,” Teledyne DALSA, 2013. [10] “Camera link technology brief,” Basler Vision Technologies, 2001. [11] (2016) Arria 10 socs: Features. Altera Corporation, now part of Intel. 101 Innovation Drive, San Jose, CA 95134. [Online]. 
Available: https: //www.altera.com/products/soc/portfolio/arria-10-soc/features.html [12] M. Parker, “Understanding peak floating-point performance claims,” Altera Corporation, June 2014. [13] (2015) Altera’s 30 billion transistor fpga. Gazettabyte. [Online]. Available: http://www.gazettabyte.com/home/2015/6/28/ alteras-30-billion-transistor-fpga.html [14] (2015) Pads. Computer Software. Mentor Graphics. [Online]. Available: https://www.pads.com [15] “Si570/si571 data sheet: 10 mhz to 1.4 ghz i2c programmable xo/vcxo,” Silicon Labs, 400 West Cesar Chavez, Austin, TX 78701, 2014. 59 [16] S. Larson. (2015) EE Wiki. Version 2.2. [Online]. Available: https: //eewiki.net/pages/viewpage.action?pageId=10125324 [17] “Programmable clock generator 5p49v6913 datasheet,” IDT, 6024 Silver Creek Valley Road, San Jose, CA 95138, 2015, revision C. [18] (2015) Using seriallite ii ip on arria 10 devices. Altera Wiki. [Online]. Available: http://www.alterawiki.com/wiki/Using SerialLite II IP on Arria 10 devices 60 APPENDICES 61 APPENDIX A REGISTER DESCRIPTIONS 62 Computation Unit Registers Table A.1: ENABLE Register Description MSB ENABLE (Block Offset = 0x0, Register Offset = 0x0) LSB Bits 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0 R/W - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I Reset 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Table A.2: IRQ ENABLE Register Description MSB IRQ ENABLE (Block Offset = 0x0, Register Offset = 0x4) LSB Bits 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0 R/W - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I Reset 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Table A.3: IRQ PENDING Register Description MSB IRQ PENDING (Block Offset = 0x0, Register Offset = 0x8) LSB Bits 31 28 27 24 23 20 19 16 15 12 11 8 7 4 3 0 R/W - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - I Reset 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 63 Table A.4: NUM BINS Register 
Description
MSB  NUM BINS (Block Offset = 0x100, Register Offset = 0x0)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    ----   ----   ----   ----   ----   ----   IIII   IIII
Reset  0000   0000   0000   0000   0000   0000   1010   0000

Table A.5: NUM PIXELS Register Description
MSB  NUM PIXELS (Block Offset = 0x100, Register Offset = 0x4)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    ----   ----   ----   ----   ----   --II   IIII   IIII
Reset  0000   0000   0000   0000   0000   0100   0000   0000

Table A.6: NUM CLASSES Register Description
MSB  NUM CLASSES (Block Offset = 0x100, Register Offset = 0x8)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    ----   ----   ----   ----   ----   ----   ---I   IIII
Reset  0000   0000   0000   0000   0000   0000   0011   0100

Table A.7: FRAME COUNT Register Description
MSB  FRAME COUNT (Block Offset = 0x100, Register Offset = 0xC)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    IIII   IIII   IIII   IIII   IIII   IIII   IIII   IIII
Reset  0000   0000   0000   0000   0000   0000   0000   0000

Table A.8: MEAN Register Description
MSB  MEAN (Block Offset = 0x1000)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    SEEE   EEEE   EFFF   FFFF   FFFF   FFFF   FFFF   FFFF

Table A.9: STD DEV I Register Description
MSB  STD DEV I (Block Offset = 0x4000)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    SEEE   EEEE   EFFF   FFFF   FFFF   FFFF   FFFF   FFFF

Table A.10: COEFFICIENT Register Description
MSB  COEFFICIENT (Block Offset = 0x100000)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    SEEE   EEEE   EFFF   FFFF   FFFF   FFFF   FFFF   FFFF

Table A.11: INNER PRODUCT Register Description
MSB  INNER PRODUCT (Block Offset = 0x200000)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    SEEE   EEEE   EFFF   FFFF   FFFF   FFFF   FFFF   FFFF

Table A.12: DECISION VECTOR Register
Description
MSB  DECISION VECTOR (Block Offset = 0x300000)  LSB
Bits   31-28  27-24  23-20  19-16  15-12  11-8   7-4    3-0
R/W    SEEE   EEEE   EFFF   FFFF   FFFF   FFFF   FFFF   FFFF

APPENDIX B

VHDL CODE

-------------------------------------------------------------
--
--! @file i2c_driver.vhd
--! @brief Controls programming of clock generator over i2c
--! @details Contains state machine for programming registers in
--! VersaClock clock generator.
--! @author Monica Whitaker
--! @date September 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and Monica Whitaker
--
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU General Public License as published by
-- the Free Software Foundation, either version 3 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-- GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public License
-- along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.
-------------------------------------------------------------
--
--! @brief i2c_driver
--! @details Contains state machine for programming registers in
--! VersaClock clock generator.
--! @param clk System clock
--! @param reset_n Reset signal
--! @param enable Enable starting state machine
--! @param i2c_scl Clock line
--! @param i2c_sda bi-directional data line
--! @param error I2C communication error
--! @param done Indicates state machine complete
--! @param burn_success Status signal after burning configuration
--
-------------------------------------------------------------
entity i2c_driver is
  port (
    clk          : in    std_logic;
    reset_n      : in    std_logic;
    enable       : in    std_logic;

    i2c_scl      : inout std_logic;
    i2c_sda      : inout std_logic;

    error        : out   std_logic;

    done         : out   std_logic;
    burn_success : out   std_logic
  );
end entity;

architecture arch of i2c_driver is
  component i2c_master is
    GENERIC(
      input_clk : INTEGER := 50_000_000; --input clock speed (Hz)
      bus_clk   : INTEGER := 400_000);   --speed of scl (Hz)
    PORT(
      clk       : IN     STD_LOGIC;
      reset_n   : IN     STD_LOGIC;
      ena       : IN     STD_LOGIC;
      addr      : IN     STD_LOGIC_VECTOR(6 DOWNTO 0);
      rw        : IN     STD_LOGIC;
      data_wr   : IN     STD_LOGIC_VECTOR(7 DOWNTO 0);
      busy      : OUT    STD_LOGIC;
      data_rd   : OUT    STD_LOGIC_VECTOR(7 DOWNTO 0);
      ack_error : BUFFER STD_LOGIC;
      sda       : INOUT  STD_LOGIC;
      scl       : INOUT  STD_LOGIC
    );
  end component;

  --address of Clock Generator device
  --xD4 (xD5 to read)
  constant address_dev : std_logic_vector(7 downto 0) := "11010100";

  --CONFIGURATION 0 HAS BEEN BURNED!!
  --CHANGE Burn Registers for further burns

  -- Reg00 Name: RAM0_00
  -- Reg00 Description: OTP Control
  -- Hex Address = 00
  -- Default = x"FF"
  constant Reg00_Addr : std_logic_vector(7 downto 0) := "00000000";
  constant Reg00_Data : std_logic_vector(7 downto 0) := "01100001";
  ---------------------------------------------------------
  -- Reg01 Name: RAM1_XTAL1
  -- Reg01 Description: X1 Load Capacitor
  -- Hex Address = 12
  -- Default = 00000001
  constant Reg01_Addr : std_logic_vector(7 downto 0) := "00010010";
  constant Reg01_Data : std_logic_vector(7 downto 0) := "00101001";
  ---------------------------------------------------------
  -- Reg02 Name: RAM1_XTAL2
  -- Reg02 Description: Factory Reserved
  -- Hex Address = 13
  -- Default = 00000000
  constant Reg02_Addr : std_logic_vector(7 downto 0) := "00010011";
  constant Reg02_Data : std_logic_vector(7 downto 0) := "00101000";
  ---------------------------------------------------------
  -- Reg03 Name: RAM1_Feedback
  -- Reg03 Description: Feedback Integer Divider (PLL)
  -- Hex Address = 17
  -- Default = 00000011
  constant Reg03_Addr : std_logic_vector(7 downto 0) := "00010111";
  constant Reg03_Data : std_logic_vector(7 downto 0) := "00000110";
  ---------------------------------------------------------
  -- Reg04 Name: RAM1_Feedback
  -- Reg04 Description: Feedback Integer Divider Bits (PLL)
  -- Hex Address = 18
  -- Default = 00000000
  constant Reg04_Addr : std_logic_vector(7 downto 0) := "00011000";
  constant Reg04_Data : std_logic_vector(7 downto 0) := "01000000";
  ---------------------------------------------------------
  -- Reg05 Name: RAM2_2E
  -- Reg05 Description: Output Divider Integer 2
  -- Hex Address = 2e
  -- Default = 11100000
  constant Reg05_Addr : std_logic_vector(7 downto 0) := "00101110";
  constant Reg05_Data : std_logic_vector(7 downto 0) := "10100000";
  ---------------------------------------------------------
  -- Reg06 Name: RAM6_60
  -- Reg06 Description: Clock1 Output Config
  -- Hex Address = 60
  -- Default = 10111011
  constant Reg06_Addr : std_logic_vector(7 downto 0) := "01100000";
  constant Reg06_Data : std_logic_vector(7 downto 0) := "01111011";
  ---------------------------------------------------------
  -- Reg07 Name: RAM1_1D
  -- Reg07 Description: VCO Monitoring
  -- Hex Address = 1D
  -- Default = 01101111
  constant Reg07_Addr : std_logic_vector(7 downto 0) := "00011101";
  constant Reg07_Data : std_logic_vector(7 downto 0) := "01001101";
  ---------------------------------------------------------
  -- Reg08 Name: RAM1_1E
  -- Reg08 Description: RC Control Register
  -- Hex Address = 1E
  -- Default = 00000000
  constant Reg08_Addr : std_logic_vector(7 downto 0) := "00011110";
  constant Reg08_Data : std_logic_vector(7 downto 0) := "10010010";
  ---------------------------------------------------------
  -- Reg09 Name: RAM1_1F
  -- Reg09 Description: RC Control Register
  -- Hex Address = 1F
  -- Default = 00110010
  constant Reg09_Addr : std_logic_vector(7 downto 0) := "00011111";
  constant Reg09_Data : std_logic_vector(7 downto 0) := "00110010";
  ---------------------------------------------------------
  -- BURN_REG1 Name: User Start Address [8:0]
  -- Description: Part-Select Bit
  -- Hex Address = 73
  constant Burn_Reg1_Addr : std_logic_vector(7 downto 0) := "01110011";
  constant Burn_Reg1_Data : std_logic_vector(7 downto 0) := "00000000";
  ---------------------------------------------------------
  -- BURN_REG2 Name: CFG0 Test block enable
  -- Description: Enable Sub-block's Test Mode
  -- Hex Address = 74
  constant Burn_Reg2_Addr : std_logic_vector(7 downto 0) := "01110100";
  constant Burn_Reg2_Data : std_logic_vector(7 downto 0) := "01001110";
  ---------------------------------------------------------
  -- BURN_REG3 Name: User End Address [8:0]
  -- Description: Part-Select Bit
  -- Hex Address = 75
  constant Burn_Reg3_Addr : std_logic_vector(7 downto 0) := "01110101";
  constant Burn_Reg3_Data : std_logic_vector(7 downto 0) := "00110100";
  ---------------------------------------------------------
  -- BURN_REG4 Name: User End Address
  -- Description: Part-Select Bit
  -- Hex Address = 76
  constant Burn_Reg4_Addr : std_logic_vector(7 downto 0) := "01110110";
  constant Burn_Reg4_Data : std_logic_vector(7 downto 0) := "11100001";
  ---------------------------------------------------------
  -- BURN_REG5 Name: Burned Register Start Address
  -- Description: Burned register start address
  -- Hex Address = 77
  constant Burn_Reg5_Addr : std_logic_vector(7 downto 0) := "01110111";
  constant Burn_Reg5_Data : std_logic_vector(7 downto 0) := "00000000";
  ---------------------------------------------------------
  -- BURN_REG6 Name: Read Register Start Address
  -- Description: Read register start address
  -- Hex Address = 78
  constant Burn_Reg6_Addr : std_logic_vector(7 downto 0) := "01111000";
  constant Burn_Reg6_Data : std_logic_vector(7 downto 0) := "00000000";
  ---------------------------------------------------------

  type state_type is (next_cmd, send_cmd);
  signal state : state_type;
  signal cmd_cnt : integer range 0 to 16; --35;
  signal end_count, cal_count : integer range 0 to 5000000;
  signal burn_count : integer range 0 to 25000000;

  signal i2c_ena, i2c_rw, i2c_busy, i2c_ack_error, busy_prev : std_logic;
  signal i2c_addr, slave_addr : std_logic_vector(6 downto 0);
  signal i2c_data_rd, i2c_data_wr, reg_addr, reg_data, data :
    std_logic_vector(7 downto 0);
  signal vco_val : std_logic_vector(4 downto 0);
  signal rw : std_logic;

begin

  i2c_io : i2c_master
    generic map(
      input_clk => 50_000_000,
      bus_clk   => 400_000)
    port map(
      clk       => clk,
      reset_n   => reset_n,
      ena       => i2c_ena,
      addr      => i2c_addr,
      rw        => i2c_rw,
      data_wr   => i2c_data_wr,
      busy      => i2c_busy,
      data_rd   => i2c_data_rd,
      ack_error => i2c_ack_error,
      sda       => i2c_sda,
      scl       => i2c_scl
    );

  process (clk, reset_n)
    variable busy_cnt : integer range 0 to 2;
  begin
    if (reset_n = '0') then
      busy_cnt := 0;
      done <= '0';
      state <= next_cmd;
      i2c_ena <= '0';
      end_count <= 0;
      cal_count <= 0;
      cmd_cnt <= 0;
      error <= '0';
      burn_count <= 0;
    elsif (rising_edge(clk)) then
      if (enable = '1') then
        case state is
          when send_cmd =>
            -- latch busy signal
            busy_prev <= i2c_busy;
            if (busy_prev = '0' and i2c_busy = '1') then
              busy_cnt := busy_cnt + 1;
            end if;

            case busy_cnt is
              when 0 =>
                i2c_ena <= '1';
                i2c_addr <= slave_addr;
                --always write first
                i2c_rw <= '0';
                i2c_data_wr <= reg_addr;
              when 1 =>
                if (rw = '1') then
                  -- if reading, do so
                  i2c_rw <= rw;
                else --otherwise, write data
                  i2c_data_wr <= reg_data;
                end if;
              when 2 =>
                i2c_ena <= '0';
                if (i2c_busy = '0') then
                  --collect data read
                  data <= i2c_data_rd;
                  busy_cnt := 0;
                  state <= next_cmd;
                end if;
            end case;

          when next_cmd =>
            --Burn process has known possibility
            --of NAK
            -- if (i2c_ack_error = '1' and cmd_cnt /= 0)
            -- then
            --   --state <= send_cmd;
            --   error <= '1';
            -- else
            case cmd_cnt is
              when 0 =>
                slave_addr <= address_dev(7 downto 1);
                rw <= '0'; --write
                reg_addr <= Reg01_Addr;
                reg_data <= Reg01_Data;
                cmd_cnt <= 1;
                state <= send_cmd;
              when 1 =>
                reg_addr <= Reg02_Addr;
                reg_data <= Reg02_Data;
                cmd_cnt <= 2;
                state <= send_cmd;
              when 2 =>
                reg_addr <= Reg03_Addr;
                reg_data <= Reg03_Data;
                cmd_cnt <= 3;
                state <= send_cmd;
              when 3 =>
                reg_addr <= Reg04_Addr;
                reg_data <= Reg04_Data;
                cmd_cnt <= 4;
                state <= send_cmd;
              when 4 =>
                reg_addr <= Reg05_Addr;
                reg_data <= Reg05_Data;
                cmd_cnt <= 5;
                state <= send_cmd;
              when 5 =>
                reg_addr <= Reg06_Addr;
                reg_data <= Reg06_Data;
                cmd_cnt <= 6;
                state <= send_cmd;
              when 6 =>
                reg_addr <= Reg07_Addr;
                reg_data <= Reg07_Data;
                cmd_cnt <= 7;
                state <= send_cmd;
              when 7 =>
                reg_addr <= Reg08_Addr;
                reg_data <= Reg08_Data;
                cmd_cnt <= 8;
                state <= send_cmd;
              when 8 =>
                reg_addr <= Reg09_Addr;
                reg_data <= Reg09_Data;
                cmd_cnt <= 9;
                state <= send_cmd;
              when 9 => --Begin VCO calibration
                reg_addr <= x"11";
                reg_data <= "00001100";
                cmd_cnt <= 10;
                state <= send_cmd;
              when 10 => --toggle bit 7
                reg_addr <= x"1C";
                reg_data <= "00000101";
                rw <= '0';
                cmd_cnt <= 11;
                state <= send_cmd;
              when 11 =>
                reg_addr <= x"1C";
                reg_data <= "10000101";
                cmd_cnt <= 12;
                state <= send_cmd;
              when 12 =>
                reg_addr <= x"1C";
                reg_data <= "00000101";
                cmd_cnt <= 13;
                state <= send_cmd;
              when 13 => --wait 100ms
                if (cal_count = 5000000) then
                  cal_count <= 0;
                  reg_addr <= x"99";
                  rw <= '1'; -- read
                  cmd_cnt <= 14;
                  state <= send_cmd;
                else
                  cal_count <= cal_count + 1;
                end if;
              when 14 =>
                --store data read from register
                vco_val <= data(7 downto 3);
                cmd_cnt <= 15;
              when 15 =>
                if (unsigned(vco_val) /=
                    to_unsigned(23, 5) and
                    unsigned(vco_val) /=
                    to_unsigned(0, 5)) then
                  --force VCO value
                  reg_addr <= x"11";
                  rw <= '0';
                  reg_data <= "001" & vco_val;
                  cmd_cnt <= 16;
                  state <= send_cmd;
                else
                  cmd_cnt <= 10; --repeat calibration
                end if;
              --ONLY used for burning configuration--
              -- when 16 =>
              --   reg_addr <= Reg00_Addr;
              --   reg_data <= Reg00_Data;
              --   cmd_cnt <= 17;
              --   state <= send_cmd;
              -- when 17 => --set up burn registers
              --   reg_addr <= Burn_Reg1_Addr;
              --   reg_data <= Burn_Reg1_Data;
              --   cmd_cnt <= 18;
              --   state <= send_cmd;
              -- when 18 =>
              --   reg_addr <= Burn_Reg2_Addr;
              --   reg_data <= Burn_Reg2_Data;
              --   cmd_cnt <= 19;
              --   state <= send_cmd;
              -- when 19 =>
              --   reg_addr <= Burn_Reg3_Addr;
              --   reg_data <= Burn_Reg3_Data;
              --   cmd_cnt <= 20;
              --   state <= send_cmd;
              -- when 20 =>
              --   reg_addr <= Burn_Reg4_Addr;
              --   reg_data <= Burn_Reg4_Data;
              --   cmd_cnt <= 21;
              --   state <= send_cmd;
              -- when 21 =>
              --   reg_addr <= Burn_Reg5_Addr;
              --   reg_data <= Burn_Reg5_Data;
              --   cmd_cnt <= 22;
              --   state <= send_cmd;
              -- when 22 =>
              --   reg_addr <= Burn_Reg6_Addr;
              --   reg_data <= Burn_Reg6_Data;
              --   cmd_cnt <= 23;
              --   state <= send_cmd;
              -- when 23 =>
              --   --wait 100ms
              --   if (end_count = 5000000) then
              --     cmd_cnt <= 24;
              --   else
              --     end_count <= end_count + 1;
              --   end if;
              -- when 24 => --start burn process
              --   reg_addr <= x"72";
              --   reg_data <= x"F0";
              --   cmd_cnt <= 25;
              --   state <= send_cmd;
              -- when 25 =>
              --   reg_addr <= x"72";
              --   reg_data <= x"F8";
              --   cmd_cnt <= 26;
              --   state <= send_cmd;
              -- when 26 =>
              --   --wait 500ms
              --   if (burn_count = 25000000) then
              --     cmd_cnt <= 27;
              --     burn_count <= 0;
              --   else
              --     burn_count <= burn_count + 1;
              --   end if;
              -- when 27 =>
              --   reg_addr <= x"72";
              --   reg_data <= x"F0";
              --   cmd_cnt <= 28;
              --   state <= send_cmd;
              -- when 28 =>
              --   reg_addr <= x"72";
              --   reg_data <= x"F8";
              --   cmd_cnt <= 29;
              --   state <= send_cmd;
              -- when 29 =>
              --   --wait 500ms
              --   if (burn_count = 25000000) then
              --     reg_addr <= x"72";
              --     reg_data <= x"F0";
              --     state <= send_cmd;
              --     cmd_cnt <= 30;
              --   else
              --     burn_count <= burn_count + 1;
              --   end if;
              -- when 30 => --margin read
              --   reg_addr <= x"72";
              --   reg_data <= x"F2";
              --   cmd_cnt <= 31;
              --   state <= send_cmd;
              -- when 31 =>
              --   reg_addr <= x"72";
              --   reg_data <= x"F0";
              --   cmd_cnt <= 32;
              --   state <= send_cmd;
              -- when 32 => -- test if successful
              --   reg_addr <= x"9F";
              --   rw <= '1';
              --   cmd_cnt <= 33;
              --   state <= send_cmd;
              -- when 33 =>
              --   if (data(1) = '1') then
              --     error <= '1';
              --   else
              --     burn_success <= '1';
              --   end if;
              --   cmd_cnt <= 34;
              -- when 34 => --reset register
              --   reg_addr <= x"9F";
              --   reg_data <= x"00";
              --   rw <= '0';
              --   cmd_cnt <= 35;
              --   state <= send_cmd;
              when 16 => --35 =>
                done <= '1';
            end case;
            -- end if;
        end case;
      end if;
    end if;
  end process;
end architecture;

-------------------------------------------------------------
--
--! @file regression.vhd
--! @brief The logistic regression computation unit.
--! @details Computes the statistical probability of a pixel
--! belonging to a particular class given several
--! different class coefficients and normalized pixel
--! data.
--! @author Monica Whitaker
--! @date September 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and
--! Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.
use IEEE.MATH_REAL.ALL;       --! Use real math library

use work.Sensor_Package.all;  --! Project constants package
-------------------------------------------------------------
--
--! @brief regression
--! @details Computes the statistical probability of a pixel
--! belonging to a particular class given several
--! different class coefficients and normalized pixel
--! data.
--! @param TOTAL_INPUT_SIZE Size of all parallel pixel data.
--! @param WORD_SIZE Standard word size
--! @param input_clk Pixel clock
--! @param enable_in Enable signal from HPS
--! @param super_pixel_in Vector of all relevant pixel information for each parallel channel
--! @param pixel_results_out Vector of probabilities and pixel number
--! @param pixel_results_flag_out Flag indicating new results on output
--! @param frame_flag_out Flag to indicate new frame
--! @param fast_clk Clock running at triple the speed of
--! the input clock
--! @param hps_clk Clock for signals from HPS
--! @param rst_n System active-low reset signal
--! @param data_valid_in Indicates new data present on
--! super_pixel_in
--! @param clear_pixel_in Indicates bad pixel and trigger to clear the currently processing pixel when high
--! @param avs_s1_read Read request from HPS
--! @param avs_s1_write Write request from HPS
--! @param avs_s1_address Data address from HPS
--! @param avs_s1_readdata Output data for HPS
--! @param avs_s1_writedata Input data from HPS
--
-------------------------------------------------------------
entity regression is
  generic (
    TOTAL_INPUT_SIZE : natural :=
      NUMBER_OF_PARALLEL_CHANNELS * SUPER_PIXEL_SIZE;
    WORD_SIZE : natural := 32
  );
  port (
    input_clk : in std_logic;
    enable_in : in std_logic;
    super_pixel_in : in std_logic_vector(TOTAL_INPUT_SIZE - 1 downto 0);
    pixel_results_out : out std_logic_vector(
      NUMBER_OF_CLASSES*WORD_SIZE+PIXEL_ADDRESS_SIZE - 1 downto 0);
    pixel_results_flag_out : out std_logic;
    fast_clk : in std_logic;
    hps_clk : in std_logic;
    hps_reset : in std_logic;
    rst_n : in std_logic;
    data_valid_in : in std_logic;
    clear_pixel_in : in std_logic;

    avs_s1_read : in std_logic;
    avs_s1_write : in std_logic;
    avs_s1_address : in std_logic_vector(31 downto 0);
    avs_s1_readdata : out std_logic_vector(31 downto 0);
    avs_s1_writedata : in std_logic_vector(31 downto 0)
  );
end entity;

architecture rtl of regression is

  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component normalize is --15 cycle latency
    port (
      clk : in std_logic;
      rst_n : in std_logic;
      data_valid_in : in std_logic;
      data_in : in std_logic_vector(31 downto 0);
      dark_in : in std_logic_vector(31 downto 0);
      lightI_in : in std_logic_vector(31 downto 0);
      mean_in : in std_logic_vector(31 downto 0);
      stddevI_in : in std_logic_vector(31 downto 0);
      normalized_out : out std_logic_vector(31 downto 0)
    );
  end component normalize;

  component fp_mult_acc is --4 cycles
    port (
      a : in std_logic_vector(31 downto 0) := (others => '0');
      acc : in std_logic := '0';
      areset : in std_logic := '0';
      b : in std_logic_vector(31 downto 0) := (others => '0');
      clk : in std_logic := '0';
      q : out std_logic_vector(31 downto 0)
    );
  end component;

  component memory_block is
    generic (
      num_elements_a : natural;
      num_elements_b : natural;
      size_address_a : natural;
      size_address_b : natural;
      size_word_a : natural;
      size_word_b : natural;
      mem_init : string := "UNUSED"
    );
    port (
      address_a : in std_logic_vector(size_address_a-1 downto 0);
      address_b : in std_logic_vector(size_address_b-1 downto 0);
      clock_a : in std_logic := '1';
      clock_b : in std_logic := '1';
      data_a : in std_logic_vector(size_word_a-1 downto 0);
      data_b : in std_logic_vector(size_word_b-1 downto 0);
      wren_a : in std_logic := '0';
      wren_b : in std_logic := '0';
      q_a : out std_logic_vector(size_word_a-1 downto 0);
      q_b : out std_logic_vector(size_word_b-1 downto 0)
    );
  end component memory_block;

  component fixed_to_float is --2 cycles
    port (
      a : in std_logic_vector(15 downto 0) := (others => '0');
      areset : in std_logic := '0';
      clk : in std_logic := '0';
      q : out std_logic_vector(31 downto 0)
    );
  end component fixed_to_float;

  component channel_sum is
    generic (
      WORD_SIZE : natural := 32
    );
    port (
      clk : in std_logic;
      fast_clk : in std_logic;
      rst_n : in std_logic;
      intercept_in : in std_logic_vector(WORD_SIZE-1 downto 0);
      data_in : in std_logic_vector(
        NUMBER_OF_PARALLEL_CHANNELS*WORD_SIZE-1 downto 0);
      result_out : out std_logic_vector(WORD_SIZE-1 downto 0)
    );
  end component;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  --valid values = 1, 2, 4, 8, 16
  constant PSEUDO_PARALLEL_CHANNELS : natural := 8;

  constant MEMORY_WORDS_HPS : natural :=
    (NUMBER_OF_SPECTRAL_BINS / NUMBER_OF_PARALLEL_CHANNELS) *
    PSEUDO_PARALLEL_CHANNELS;
  constant HPS_MEM_ADDR_SIZE : natural :=
    natural(log2(real(MEMORY_WORDS_HPS)));

  constant CONVERSION_LEVELS : natural := 3;
  constant NORMALIZE_LEVELS : natural := 15;
  constant PRODUCT_LEVELS : natural := 4;
  --3 cycles per add
  constant COMBINATION_LEVELS : natural :=
    2*(NUMBER_OF_PARALLEL_CHANNELS);
  constant NUMBER_LEVELS : natural := CONVERSION_LEVELS +
    NORMALIZE_LEVELS + PRODUCT_LEVELS + COMBINATION_LEVELS + 2;

  constant ZEROS : std_logic_vector(31 downto 0) := x"00000000";
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  type write_array is array (1 to NUMBER_OF_CLASSES) of std_logic;
  type word_array is array (1 to NUMBER_OF_CLASSES) of
    std_logic_vector(WORD_SIZE - 1 downto 0);
  type class_array is array (1 to NUMBER_OF_CLASSES) of
    std_logic_vector(WORD_SIZE*
    PSEUDO_PARALLEL_CHANNELS - 1 downto 0);

  type row_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
    std_logic_vector(SPECTRAL_BIN_ADDRESS_SIZE - 1 downto 0);
  type column_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
    std_logic_vector(PIXEL_ADDRESS_SIZE - 1 downto 0);
  type in_data_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
    std_logic_vector(DATA_SIZE - 1 downto 0);
  type data_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
    std_logic_vector(WORD_SIZE - 1 downto 0);
  type partials_array is array (1 to NUMBER_OF_CLASSES) of data_array;
  type product_array is array (1 to NUMBER_OF_CLASSES) of
    std_logic_vector(NUMBER_OF_PARALLEL_CHANNELS*WORD_SIZE - 1 downto 0);

  type prod_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of std_logic;
  type prod_sig_array is array (1 to NUMBER_OF_CLASSES) of prod_array;

  type data_levels_array is array (1 to NUMBER_LEVELS) of data_array;
  type bins_array is array (1 to NUMBER_LEVELS) of row_array;
  type pixels_array is array (1 to NUMBER_LEVELS) of column_array;
  type logic_array is array (1 to NUMBER_LEVELS) of std_logic;
  type mem_addr_array is array (1 to NUMBER_LEVELS) of
    std_logic_vector(natural(trunc(log2(real(
    NUMBER_OF_SPECTRAL_BINS / NUMBER_OF_PARALLEL_CHANNELS))))-1 downto 0);
  -------------------------------------------------------------
  -- Signal Definitions
  -------------------------------------------------------------
  signal data_clk : std_logic;
  signal reset : std_logic;
  signal mean_write : std_logic;
  signal stddevI_write : std_logic;

  signal valid, pixel_err : logic_array;

  signal row0 : row_array;
  signal column0 : column_array;
  signal bin : bins_array;
  signal pixel : pixels_array;
  signal mem_address : mem_addr_array;
  signal data_float : data_array;
signal normal : data ar ray ; 238 signal data : i n da t a a r r ay ; 239 signal l i g h t I : d a t a l e v e l s a r r a y ; 240 signal dark : d a t a l e v e l s a r r a y ; 241 signal c l a s s , c l a s s s i g : c l a s s a r r a y ; 242 243 signal i n t e r c e p t s : word array ; 244 signal r e s u l t s : word array ; 245 signal r e su l t s t emp : word array ; 246 signal r e a d c l a s s : word array ; 247 signal r e ad i n t e r c e p t : word array ; 248 signal r e a d r e s u l t : word array ; 249 signal z e r o a r r ay : word array := 250 ( others => x"00000000" ) ; 251 252 signal p a r t i a l : p a r t i a l s a r r a y ; 253 254 signal f i n a l p a r t i a l : p roduct ar ray ; 255 256 signal c l a s s w r i t e : w r i t e a r r ay ; 257 signal i n t e r c e p t w r i t e : w r i t e a r r ay ; 258 signal r e s u l t w r i t e : w r i t e a r r ay ; 259 85 260 signal acc : p r od s i g a r r ay ; 261 262 signal mean : s t d l o g i c v e c t o r ( PSEUDO PARALLEL CHANNELS∗ WORD SIZE − 1 downto 0) ; 263 signal s tddevI : s t d l o g i c v e c t o r ( PSEUDO PARALLEL CHANNELS∗ WORD SIZE − 1 downto 0) ; 264 265 signal c l a s s add r : s t d l o g i c v e c t o r (HPS MEM ADDR SIZE −1 downto 0) ; 266 signal mean addr : s t d l o g i c v e c t o r (HPS MEM ADDR SIZE −1 downto 0) ; 267 signal stddev addr : s t d l o g i c v e c t o r (HPS MEM ADDR SIZE −1 downto 0) ; 268 269 signal read mean : s t d l o g i c v e c t o r (WORD SIZE − 1 downto 0) ; 270 signal r ead s tddev I : s t d l o g i c v e c t o r (WORD SIZE − 1 downto 0) ; 271 272 begin 273 274 −− l a s t b in o f p i x e l has f i n i s h e d proce s s ing 275 p i x e l r e s u l t s f l a g o u t <= ’1 ’ when ( bin (NUMBER LEVELS) ( NUMBEROF PARALLEL CHANNELS) = s t d l o g i c v e c t o r ( to uns igned (NUMBER OF SPECTRAL BINS−1, SPECTRAL BIN ADDRESS SIZE) ) ) else ’ 0 ’ ; 276 277 i memory block means : memory block −−fpga on b , hps on a 278 generic map( 279 num elements a => 
        MEMORY_WORDS_HPS,
      num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                        NUMBER_OF_PARALLEL_CHANNELS,
      size_address_a => HPS_MEM_ADDR_SIZE,
      size_address_b => natural(trunc(log2(real(NUMBER_OF_SPECTRAL_BINS /
                        NUMBER_OF_PARALLEL_CHANNELS)))),
      size_word_a    => WORD_SIZE,
      size_word_b    => PSEUDO_PARALLEL_CHANNELS * WORD_SIZE,
      mem_init       => "means.mif"
    )
    port map(
      address_a => mean_addr,
      address_b => mem_address(CONVERSION_LEVELS - 1),
      clock_a   => hps_clk,
      clock_b   => data_clk,
      data_a    => avs_s1_writedata,
      data_b    => (others => '0'),
      wren_a    => mean_write,
      wren_b    => '0',
      q_a       => read_mean,
      q_b       => mean
    );

  mean_addr <= avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) when avs_s1_address(10) = '1';

  i_memory_block_stddevs : memory_block --read on b, write on a
    generic map(
      num_elements_a => MEMORY_WORDS_HPS,
      num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                        NUMBER_OF_PARALLEL_CHANNELS,
      size_address_a => HPS_MEM_ADDR_SIZE,
      size_address_b => natural(trunc(log2(real(NUMBER_OF_SPECTRAL_BINS /
                        NUMBER_OF_PARALLEL_CHANNELS)))),
      size_word_a    => WORD_SIZE,
      size_word_b    => PSEUDO_PARALLEL_CHANNELS * WORD_SIZE,
      mem_init       => "stddevs.mif"
    )
    port map(
      address_a => stddev_addr,
      address_b => mem_address(CONVERSION_LEVELS - 1),
      clock_a   => hps_clk,
      clock_b   => data_clk,
      data_a    => avs_s1_writedata,
      data_b    => (others => '0'),
      wren_a    => stddevI_write,
      wren_b    => '0',
      q_a       => read_stddevI,
      q_b       => stddevI
    );

  stddev_addr <= avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) when avs_s1_address(12) = '1';

  g_normalize : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

    i_fixed_to_float : fixed_to_float
      port map(
        a      => data(j),
        areset => reset,
        clk    => data_clk,
        q      => data_float(j)
      );

    --normalize by light, dark, mean, stddev
    i_normalize : normalize
      port map(
        clk            => data_clk,
        rst_n          => rst_n,
        data_valid_in  => valid(CONVERSION_LEVELS),
        data_in        => data_float(j),
        dark_in        => dark(CONVERSION_LEVELS)(j),
        lightI_in      => lightI(CONVERSION_LEVELS)(j),
        mean_in        => mean(WORD_SIZE * j - 1 downto WORD_SIZE * (j - 1)),
        stddevI_in     => stddevI(WORD_SIZE * j - 1 downto WORD_SIZE * (j - 1)),
        normalized_out => normal(j)
      );

  end generate;

  class_addr <= std_logic_vector(unsigned(avs_s1_address(
                HPS_MEM_ADDR_SIZE - 1 downto 0)) - 1);

  g_classify : for i in 1 to NUMBER_OF_CLASSES generate

    i_memory_block_intercepts : memory_block
      generic map(
        num_elements_a => 1,
        num_elements_b => 1,
        size_address_a => 1,
        size_address_b => 1,
        size_word_a    => WORD_SIZE,
        size_word_b    => WORD_SIZE,
        mem_init       => "UNUSED"
      )
      port map(
        address_a => "0",
        address_b => "0",
        clock_a   => data_clk,
        clock_b   => hps_clk,
        data_a    => (others => '0'),
        data_b    => avs_s1_writedata,
        wren_a    => '0',
        wren_b    => intercept_write(i),
        q_a       => intercepts(i),
        q_b       => read_intercept(i)
      );

    i_memory_block_classes : memory_block --FPGA on b, HPS on a
      generic map(
        num_elements_a => MEMORY_WORDS_HPS,
        num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                          NUMBER_OF_PARALLEL_CHANNELS,
        size_address_a => HPS_MEM_ADDR_SIZE,
        size_address_b => natural(trunc(log2(real(
                          NUMBER_OF_SPECTRAL_BINS /
                          NUMBER_OF_PARALLEL_CHANNELS)))),
        size_word_a    => WORD_SIZE,
        size_word_b    => (PSEUDO_PARALLEL_CHANNELS *
                          WORD_SIZE),
        mem_init       => "UNUSED"
      )
      port map(
        address_a => class_addr,
        address_b => mem_address(NORMALIZE_LEVELS + CONVERSION_LEVELS - 1),
        clock_a   => hps_clk,
        clock_b   => data_clk,
        data_a    => avs_s1_writedata,
        data_b    => (others => '0'),
        wren_a    => class_write(i),
        wren_b    => '0',
        q_a       => read_class(i),
        q_b       => class(i)
      );

    --used in inner product calculation
    class_sig(i) <= class(i) when valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1'
                    else (others => '0');

    --Refer to register description document
    intercept_write(i) <= avs_s1_write when
      to_integer(unsigned(avs_s1_address(31 downto 18))) = 1 and
      to_integer(unsigned(avs_s1_address(17 downto HPS_MEM_ADDR_SIZE))) = i and
      avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) = ZEROS(HPS_MEM_ADDR_SIZE - 1 downto 0)
      else '0';

    class_write(i) <= avs_s1_write when
      to_integer(unsigned(avs_s1_address(31 downto 18))) = 1 and
      to_integer(unsigned(avs_s1_address(17 downto HPS_MEM_ADDR_SIZE))) = i and
      avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) /= ZEROS(HPS_MEM_ADDR_SIZE - 1 downto 0)
      else '0';

    result_write(i) <= '1' when (bin(NUMBER_LEVELS)(NUMBER_OF_PARALLEL_CHANNELS) =
        std_logic_vector(to_unsigned(NUMBER_OF_SPECTRAL_BINS - 1, SPECTRAL_BIN_ADDRESS_SIZE)))
      OR pixel_err(NUMBER_LEVELS) = '1' else '0';

    --pixel_results_out =>
    pixel_results_out(NUMBER_OF_CLASSES * WORD_SIZE + PIXEL_ADDRESS_SIZE - 1 downto
                      NUMBER_OF_CLASSES * WORD_SIZE) <= pixel(NUMBER_LEVELS)(1);

    pixel_results_out(WORD_SIZE * i - 1 downto WORD_SIZE * (i - 1)) <= results(i);

    i_memory_block_results :
      memory_block --FPGA on a, HPS on b
      generic map(
        num_elements_a => NUMBER_OF_PIXELS,
        num_elements_b => NUMBER_OF_PIXELS,
        size_address_a => PIXEL_ADDRESS_SIZE,
        size_address_b => PIXEL_ADDRESS_SIZE,
        size_word_a    => WORD_SIZE,
        size_word_b    => WORD_SIZE,
        mem_init       => "UNUSED"
      )
      port map(
        address_a => pixel(NUMBER_LEVELS)(1),
        address_b => avs_s1_address(PIXEL_ADDRESS_SIZE - 1 downto 0),
        clock_a   => data_clk,
        clock_b   => hps_clk,
        data_a    => results(i),
        data_b    => (others => '0'),
        wren_a    => result_write(i),
        wren_b    => '0',
        q_a       => open,
        q_b       => read_result(i)
      );


    --add pixel results across parallel channels
    i_channel_sum_sum : channel_sum
      generic map(
        WORD_SIZE => WORD_SIZE
      )
      port map(
        clk          => data_clk,
        fast_clk     => fast_clk,
        rst_n        => rst_n,
        intercept_in => intercepts(i),
        data_in      => final_partial(i),
        result_out   => results_temp(i)
      );

    result_lock : process (data_clk, rst_n)
    begin
      if (rst_n = '0') then
        results(i) <= zero_array(i);
      elsif (rising_edge(data_clk)) then
        if (pixel_err(NUMBER_LEVELS - 1) = '0') then
          results(i) <= results_temp(i);
        else
          results(i) <= zero_array(i);
        end if;
      end if;
    end process;

    g_product : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

      --do not accumulate when: beginning of pixel (first 5 bins)
      acc(i)(j) <= '0' when (j = 1 and
                             bin(NORMALIZE_LEVELS + CONVERSION_LEVELS)(1) =
                               ZEROS(SPECTRAL_BIN_ADDRESS_SIZE - 1 downto 0) and
                             valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1')
                         or (j /= 1 and
                             bin(NORMALIZE_LEVELS + CONVERSION_LEVELS)(j) =
                               std_logic_vector(to_unsigned(j - 1, SPECTRAL_BIN_ADDRESS_SIZE)) and
                             valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1')
                   else '1';

      final_partial(i)(WORD_SIZE * (NUMBER_OF_PARALLEL_CHANNELS - (j - 1)) - 1 downto
                       WORD_SIZE * (NUMBER_OF_PARALLEL_CHANNELS - j)) <= partial(i)(j);

      i_fp_mult_acc : fp_mult_acc
        port map(
          a      => normal(j),
          acc    => acc(i)(j),
          areset => reset,
          b      => class_sig(i)(WORD_SIZE * j - 1 downto WORD_SIZE * (j - 1)),
          clk    => data_clk,
          q      => partial(i)(j)
        );

    end generate;

  end generate;

  reset    <= not rst_n;
  data_clk <= input_clk when enable_in = '1' else '0';

  --separate location information from input data
  base_location : for k in 1 to NUMBER_OF_PARALLEL_CHANNELS generate
    row0(k) <= super_pixel_in((TOTAL_INPUT_SIZE -
               (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE - 1)
               downto (TOTAL_INPUT_SIZE -
               (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE -
               SPECTRAL_BIN_ADDRESS_SIZE));
    column0(k) <= super_pixel_in((TOTAL_INPUT_SIZE -
                  (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE -
                  SPECTRAL_BIN_ADDRESS_SIZE - 1) downto
                  (TOTAL_INPUT_SIZE - (NUMBER_OF_PARALLEL_CHANNELS - k) *
                  SUPER_PIXEL_SIZE - SPECTRAL_BIN_ADDRESS_SIZE -
                  PIXEL_ADDRESS_SIZE));
  end generate;


  -- Address Map
  -- 1000 - beginning of mean
  -- 4000 - beginning of stddev
  -- 100000 - beginning of class coefficients
  read_mux : process (hps_clk, hps_reset)
  begin
    if (hps_reset = '1') then
      avs_s1_readdata <= (others => '0');
      mean_write      <= '0';
      stddevI_write   <= '0';
    elsif (rising_edge(hps_clk)) then
      if (avs_s1_read = '1') then
        mean_write    <= '0';
        stddevI_write <= '0';
        if (to_integer(unsigned(avs_s1_address(31 downto 10))) = 1) then
          avs_s1_readdata <= read_mean;
        elsif (to_integer(unsigned(avs_s1_address(31 downto 10))) = 4) then
          avs_s1_readdata <= read_stddevI;
        elsif (to_integer(unsigned(avs_s1_address(31 downto 18))) = 1) then
          if (avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) = ZEROS(
              HPS_MEM_ADDR_SIZE - 1 downto 0)) then
            avs_s1_readdata <= std_logic_vector(read_intercept(
                               to_integer(unsigned(avs_s1_address(
                               17 downto HPS_MEM_ADDR_SIZE)))));
          else
            avs_s1_readdata <= std_logic_vector(read_class(
                               to_integer(unsigned(avs_s1_address(17
                               downto HPS_MEM_ADDR_SIZE)))));
          end if;
        elsif (avs_s1_address(19) = '1') then
          avs_s1_readdata <= std_logic_vector(read_result(
                             to_integer(unsigned(avs_s1_address(18
                             downto PIXEL_ADDRESS_SIZE)))));
        else
          avs_s1_readdata <= (others => '0');
        end if;
      elsif (avs_s1_write = '1') then
        avs_s1_readdata <= (others => '0');
        if (to_integer(unsigned(avs_s1_address(31 downto 10))) = 1) then
          mean_write    <= '1';
          stddevI_write <= '0';
        elsif (to_integer(unsigned(avs_s1_address(31 downto 10))) = 4) then
          stddevI_write <= '1';
          mean_write    <= '0';
        else
          mean_write    <= '0';
          stddevI_write <= '0';
        end if;
      else
        avs_s1_readdata <= (others => '0');
        mean_write      <= '0';
        stddevI_write   <= '0';
      end if;
    end if;
  end process;

  --pipeline for data information
  data_proc : process (data_clk, rst_n)
  begin
    if (rst_n = '0') then
      for k in 1 to NUMBER_LEVELS loop
        bin(k)   <= (others => (others => '0'));
        pixel(k) <= (others => (others => '0'));
      end loop;
    elsif (rising_edge(data_clk)) then

      for k in 1 to NUMBER_OF_PARALLEL_CHANNELS loop
        --lock-in input data
        data(k) <= super_pixel_in((TOTAL_INPUT_SIZE -
                   (NUMBER_OF_PARALLEL_CHANNELS - k) *
                   SUPER_PIXEL_SIZE - SPECTRAL_BIN_ADDRESS_SIZE -
                   PIXEL_ADDRESS_SIZE - 1) downto
                   (TOTAL_INPUT_SIZE - (
                   NUMBER_OF_PARALLEL_CHANNELS - k) *
                   SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE));

        lightI(1)(k) <= super_pixel_in((TOTAL_INPUT_SIZE -
                        (NUMBER_OF_PARALLEL_CHANNELS - k) *
                        SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE - 1) downto
                        (TOTAL_INPUT_SIZE - (
                        NUMBER_OF_PARALLEL_CHANNELS - k) *
                        SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                        LIGHT_CORRECT_SIZE));

        dark(1)(k) <= super_pixel_in((TOTAL_INPUT_SIZE -
                      (NUMBER_OF_PARALLEL_CHANNELS - k) *
                      SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                      LIGHT_CORRECT_SIZE - 1) downto
                      (TOTAL_INPUT_SIZE - (
                      NUMBER_OF_PARALLEL_CHANNELS - k) *
                      SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                      LIGHT_CORRECT_SIZE - DARK_CORRECT_SIZE));
      end loop;

      for k in 1 to NUMBER_LEVELS loop
        if (k = 1) then
          valid(k)     <= data_valid_in;
          pixel_err(k) <= clear_pixel_in;
          bin(k)       <= row0;
          pixel(k)     <= column0;
          if (data_valid_in = '1') then
            if (row0(k) = ZEROS(SPECTRAL_BIN_ADDRESS_SIZE - 1 downto 0)) then
              mem_address(k) <= ZEROS(natural(trunc(log2(real(
                                NUMBER_OF_SPECTRAL_BINS /
                                NUMBER_OF_PARALLEL_CHANNELS)))) - 1 downto 0);
            else --only increment address with each valid input
              mem_address(k) <= std_logic_vector(unsigned(mem_address(k)) + 1);
            end if;
          end if;
        else
          valid(k)       <= valid(k-1);
          pixel_err(k)   <= pixel_err(k-1);
          bin(k)         <= bin(k-1);
          pixel(k)       <= pixel(k-1);
          mem_address(k) <= mem_address(k-1);
          lightI(k)      <= lightI(k-1);
          dark(k)        <= dark(k-1);
        end if;
      end loop;
    end if;
  end process;
end architecture;

-------------------------------------------------------------
--
--! @file normalize.vhd
--! @brief Implements normalization of pixel data
--! @details Utilizes multiplication and subtraction
--!          megafunctions to normalize incoming floating
--!          point data values
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.
-------------------------------------------------------------
--
--! @brief normalize
--! @details Utilizes multiplication and subtraction
--!          megafunctions to normalize incoming floating
--!          point data values
--! @param clk             Input clk
--! @param rst_n           Active low reset
--! @param data_valid_in   Enable signal for valid input
--! @param data_in         Pixel data value
--! @param dark_in         Dark correction value
--! @param lightI_in       Inverted light correction value
--! @param mean_in         Mean value
--! @param stddevI_in      Inverted standard deviation value
--! @param normalized_out  Normalized pixel data
--
-------------------------------------------------------------
entity normalize is
  port(
    clk            : in  std_logic;
    rst_n          : in  std_logic;
    data_valid_in  : in  std_logic;
    data_in        : in  std_logic_vector(31 downto 0);
    dark_in        : in  std_logic_vector(31 downto 0);
    lightI_in      : in  std_logic_vector(31 downto 0);
    mean_in        : in  std_logic_vector(31 downto 0);
    stddevI_in     : in  std_logic_vector(31 downto 0);
    normalized_out : out std_logic_vector(31 downto 0)
  );
end entity normalize;

architecture rtl of normalize is
  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component fp_func_subtract is --3 cyc
    port(
      a      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(31 downto 0)
    );
  end component fp_func_subtract;

  component fp_func_mult is --3 cyc
    port(
      a      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(31 downto 0)
    );
  end component fp_func_mult;

  component gte_compare is
    port(
      a      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(0 downto 0)
    );
  end component gte_compare;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  constant NUMBER_LEVELS : natural := 15;
  constant ZEROS : std_logic_vector(31 downto 0) :=
    (others => '0');
  constant ONE : std_logic_vector(31 downto 0) :=
    x"3F800000";
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  type valid_array is array (1 to NUMBER_LEVELS) of std_logic;
  -------------------------------------------------------------
  -- Signal Definitions
  -------------------------------------------------------------
  signal data_valid : valid_array;

  signal lightI1, lightI2, lightI3, lightI4, lightI5 :
    std_logic_vector(31 downto 0);
  signal mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 :
    std_logic_vector(31 downto 0);
  signal stdDev1, stdDev2, stdDev3, stdDev4,
         stdDev5, stdDev6, stdDev7, stdDev8, stdDev9, stdDev10, stdDev11 :
    std_logic_vector(31 downto 0);
  signal diff_temp       : std_logic_vector(31 downto 0);
  signal diff            : std_logic_vector(31 downto 0);
  signal corrected_temp  : std_logic_vector(31 downto 0);
  signal corrected       : std_logic_vector(31 downto 0);
  signal normalized_temp : std_logic_vector(31 downto 0);
  signal normalized      : std_logic_vector(31 downto 0);
  signal result          : std_logic_vector(0 downto 0);

  signal reset : std_logic;

begin

  reset <= not rst_n;
  --Use Dark and Light to normalize between 0 and 1
  dark_sub : fp_func_subtract
    port map(
      a      => data_in,
      areset => reset,
      b      => dark_in,
      clk    => clk,
      q      => diff_temp
    );

  light_mult : fp_func_mult
    port map(
      a      => diff,
      areset => reset,
      b      => lightI4,
      clk    => clk,
      q      => corrected_temp
    );

  correct_compare : gte_compare
    port map(
      a      => corrected_temp,
      areset => reset,
      b      => ONE,
      clk    => clk,
      q      => result
    );

  mean_sub : fp_func_subtract
    port map(
      a      => corrected,
      areset => reset,
      b      => mean8,
      clk    => clk,
      q      => normalized_temp
    );

  stddev_mult : fp_func_mult
    port map(
      a      => normalized_temp,
      areset => reset,
      b      => stdDev11,
      clk    => clk,
      q      => normalized
    );

  proc : process (clk, rst_n)
  begin
    if (rst_n = '0') then
      normalized_out <= ZEROS;
    elsif (rising_edge(clk)) then
      --pipeline values
      lightI1 <= lightI_in;
      lightI2 <= lightI1;
      lightI3 <= lightI2;
      lightI4 <= lightI3;

      mean1 <= mean_in;
      mean2 <= mean1;
      mean3 <= mean2;
      mean4 <= mean3;
      mean5 <= mean4;
      mean6 <= mean5;
      mean7 <= mean6;
      mean8 <= mean7;

      stdDev1  <= stddevI_in;
      stdDev2  <= stdDev1;
      stdDev3  <= stdDev2;
      stdDev4  <= stdDev3;
      stdDev5  <= stdDev4;
      stdDev6  <= stdDev5;
      stdDev7  <= stdDev6;
      stdDev8  <= stdDev7;
      stdDev9  <= stdDev8;
      stdDev10 <= stdDev9;
      stdDev11 <= stdDev10;

      --pipeline valid signal
      for k in 1 to NUMBER_LEVELS loop
        if (k = 1) then
          data_valid(k) <= data_valid_in;
        else
          data_valid(k) <= data_valid(k-1);
        end if;
      end loop;

      -- Check for negative values
      if (diff_temp(31) = '1') then
        diff <= (others => '0');
      else
        diff <= diff_temp;
      end if;

      if (result = "1") then -- corrected is >= 1
        corrected <= ONE;
      else
        corrected <= corrected_temp;
      end if;

      --not new data, keep output at one to preserve inner product
      if (data_valid(NUMBER_LEVELS - 1) = '1') then
        normalized_out <= normalized;
      else
        normalized_out <= ONE;
      end if;

    end if;
  end process;
end architecture;

-------------------------------------------------------------
--
--! @file channel_sum.vhd
--! @brief Adds together parallel inputs
--! @details Compiles sum over number of parallel channels and
--!          adds in the 0th classification coefficient
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.

use work.Sensor_Package.all;  --! Project constants package file
-------------------------------------------------------------
--
--! @brief channel_sum
--! @details Compiles sum over number of parallel channels and
--!          adds in the 0th classification coefficient
--! @param WORD_SIZE        Standard size of floating point data
--! @param clk              Input clk for data rate
--! @param fast_clk         Input clock running at triple
--!                         the speed of the clk
--! @param rst_n            Active low reset
--! @param intercept_in     0th classification coefficient
--! @param data_in          Vector of probabilities
--! @param decision_vector  Sum of all probabilities in
--!                         data_in and intercept_in
-------------------------------------------------------------
entity channel_sum is
  generic(
    WORD_SIZE : natural := 32
  );
  port(
    clk          : in  std_logic;
    fast_clk     : in  std_logic;
    rst_n        : in  std_logic;
    intercept_in : in  std_logic_vector(WORD_SIZE - 1 downto 0);
    data_in      : in  std_logic_vector(NUMBER_OF_PARALLEL_CHANNELS * WORD_SIZE - 1 downto 0);
    result_out   : out std_logic_vector(WORD_SIZE - 1 downto 0)
  );
end entity;

architecture rtl of channel_sum is

  component fp_func_add is --3 cycle latency
    port(
      a      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) :=
               (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(31 downto 0)
    );
  end component fp_func_add;

  constant adder_latency      : natural := 2;
  constant combination_levels : natural := NUMBER_OF_PARALLEL_CHANNELS * adder_latency;

  type data_array is array (1 to combination_levels) of
    std_logic_vector(NUMBER_OF_PARALLEL_CHANNELS * WORD_SIZE - 1 downto 0);
  type answer_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
    std_logic_vector(WORD_SIZE - 1 downto 0);

  signal data_delay   : data_array;
  signal output       : answer_array := (others => (others => '0'));
  signal temp_results : answer_array;
  signal reset        : std_logic;

begin

  reset <= not rst_n;

  g_adder : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

    i_add_fp_func_add : fp_func_add
      port map(
        a      => temp_results(j),
        areset => reset,
        b      => data_delay(adder_latency * (j - 1) + 1)
                  (NUMBER_OF_PARALLEL_CHANNELS * WORD_SIZE - (WORD_SIZE * (j - 1)) - 1 downto
                   NUMBER_OF_PARALLEL_CHANNELS * WORD_SIZE - WORD_SIZE * j),
        clk    => fast_clk,
        q      => output(j)
      );

  end generate;

  pipeline : process (clk, rst_n)
  begin
    if (rst_n = '0') then
      result_out <= (others => '0');
    elsif (rising_edge(clk)) then
      for k in 1 to combination_levels loop
        if (k = 1) then
          data_delay(k) <= data_in;
        else
          data_delay(k) <= data_delay(k-1);
        end if;
      end loop;
      for j in 1 to NUMBER_OF_PARALLEL_CHANNELS loop
        if (j = 1) then
          temp_results(j) <= intercept_in;
        else
          temp_results(j) <= output(j-1);
        end if;
      end loop;
      result_out <= output(NUMBER_OF_PARALLEL_CHANNELS);
    end if;
  end process;
end architecture;

-------------------------------------------------------------
--
--! @file sort.vhd
--! @brief Sorts parallel inputs in descending order
--! @details Sorts input in two clock cycles and outputs
--!          sorted index numbers in addition to sorted
--!          results
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.
use IEEE.MATH_REAL.ALL;       --! Use real math library

use work.Sensor_Package.all;  --! Project constants package file
-------------------------------------------------------------
--
--! @brief sort
--! @details Sorts input in two clock cycles and outputs
--!          sorted index numbers in addition to sorted
--!          results
--! @param WORD_SIZE           Standard size of floating point data
--! @param clk                 Input clk for data rate
--! @param rst_n               Active low reset
--! @param u_list_in           Unsorted vector of values
--! @param s_list_out          Sorted vector of values
--! @param s_list_indices_out  Vector of indices of sorted values in
--!                            sorted order
-------------------------------------------------------------

entity sort is
  generic(
    WORD_SIZE : natural := 32
  );
  port(
    clk                : in  std_logic;
    rst_n              : in  std_logic;
    u_list_in          : in  std_logic_vector(NUMBER_OF_CLASSES * WORD_SIZE - 1 downto 0);
    s_list_out         : out std_logic_vector(NUMBER_OF_CLASSES * WORD_SIZE - 1 downto 0);
    s_list_indices_out : out std_logic_vector(NUMBER_OF_CLASSES *
                         natural(trunc(log2(real(NUMBER_OF_CLASSES)))) - 1 downto 0)
  );
end entity;

architecture rtl of sort is

  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component gt_compare is --a > b --> q = 1
    port(
      a      : in  std_logic_vector(31 downto 0) := (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) := (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(0 downto 0)
    );
  end component gt_compare;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  constant INDEX_BITS : natural := natural(trunc(log2(real(
    NUMBER_OF_CLASSES))));
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  type list_array is array (1 to NUMBER_OF_CLASSES) of
    std_logic_vector(31 downto 0);
  type position_array is array (1 to NUMBER_OF_CLASSES) of
    integer range 0 to NUMBER_OF_CLASSES;
  type result_array is array (1 to NUMBER_OF_CLASSES) of std_logic;
  type result_expand_array is array (1 to NUMBER_OF_CLASSES) of result_array;
  -------------------------------------------------------------
  -- Signal Definitions
  -------------------------------------------------------------
  signal unsorted, unsorted_reg : list_array;
  signal result       : result_expand_array;
  signal sorted_index : position_array;
  signal reset        : std_logic;

begin

  reset <= not rst_n;

  g_compare : for j in 1 to NUMBER_OF_CLASSES generate

    unsorted(j) <= u_list_in((NUMBER_OF_CLASSES - (j - 1)) * WORD_SIZE - 1 downto
                             (NUMBER_OF_CLASSES - j) * WORD_SIZE);

    g_inner_compare : for k in 1 to NUMBER_OF_CLASSES generate
      i_compare : gt_compare
        port map(
          a      => unsorted(j),
          areset => reset,
          b      => unsorted(k),
          clk    => clk,
          q(0)   => result(j)(k)
        );
    end generate;

  end generate;

  process (clk, rst_n)
    variable sum_index : position_array;
  begin
    if (rst_n = '0') then
      sorted_index       <= (others => 0);
      sum_index          := (others => 0);
      s_list_indices_out <= (others => '0');
      s_list_out         <= (others => '0');
    elsif (rising_edge(clk)) then
      unsorted_reg <= unsorted;
      sum_index    := (others => 0);
      for j in 1 to NUMBER_OF_CLASSES loop
        for k in 1 to NUMBER_OF_CLASSES loop
          if (k >= j+1) then
            if (result(j)(k) = '1') then
              sum_index(j) := sum_index(j) + 1;
            else
              sum_index(k) := sum_index(k) + 1;
            end if;
          end if;
        end loop;
        sorted_index(j) <= sum_index(j) - 1; --start from 0
        s_list_indices_out(INDEX_BITS * (NUMBER_OF_CLASSES -
          (sorted_index(j))) - 1 downto INDEX_BITS * (
          NUMBER_OF_CLASSES - (sorted_index(j) + 1))) <=
          std_logic_vector(to_unsigned(j,
          INDEX_BITS));
        --ordered least to greatest
        slist_out(WORD_SIZE*(NUMBER_OF_CLASSES-sorted_index(j))-1
          downto WORD_SIZE*(NUMBER_OF_CLASSES-(sorted_index(j)+1)))
          <= unsorted_reg(j);
      end loop;
    end if;
  end process;
end architecture;

-- --------------------------------------------------------------------------
--
--! @file object_tracking.vhd
--! @brief Builds up classification based on object edges
--! @details Uses input from hyperspectral classifications and
--!          monochrome edge detection to compile object
--!          classifications over the identified pixels.
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-- --------------------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;     --! Use numeric standard.
use IEEE.MATH_REAL.ALL;       --! Use real math library

use work.Sensor_Package.all;  --! Project constants package file
-- --------------------------------------------------------------------------
--
--! @brief object_tracking
--! @details Uses input from hyperspectral classifications and
--!          monochrome edge detection to compile object
--!          classifications over the identified pixels.
--!          Keeps array of object numbers based on pixel
--!          number.
--! @param MAX_OBJECT_NUMBER  Maximum number of objects
--!                           possible at any one time
--! @param WORD_SIZE          Standard size of floating point data
--! @param linescan_clk       Input clk from transmission of
--!                           monochrome data
--! @param data_clk           Input clock from hyperspectral
--!                           classification
--! @param fast_clk           Input clock running at triple
--!                           the speed of the data_clk
--! @param rst_n              Active low reset
--! @param line_rst_n         Active low reset for linescan_clk domain
--! @param linescan_obj       Information about object
--!                           location from line scan camera
--!                           Contains line number, object
--!                           number, start pixel, end
--!                           pixel
--! @param new_results        Flag to indicate new
--!                           hyperspectral pixel results
--! @param class_results_in   Hyperspectral results vector
--!                           of class probabilities with
--!                           pixel number
--! @param decision_vector    Vector of overall
--!                           probabilities for classes
--!                           and object number.
-- --------------------------------------------------------------------------
entity object_tracking is
  generic (MAX_OBJECT_NUMBER : natural := 64;
           WORD_SIZE         : natural := 32
          );
  port (linescan_clk     : in  std_logic;
        data_clk         : in  std_logic;
        fast_clk         : in  std_logic;
        rst_n            : in  std_logic;
        line_rst_n       : in  std_logic;
        linescan_obj     : in  std_logic_vector(PIXEL_ADDRESS_SIZE*2+OBJECT_ADDRESS_SIZE+WORD_SIZE-1 downto 0);
        new_results      : in  std_logic;
        class_results_in : in  std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE+PIXEL_ADDRESS_SIZE-1 downto 0);
        decision_vector  : out std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE+OBJECT_ADDRESS_SIZE-1 downto 0)
       );
end entity;

architecture arch of object_tracking is

  ----------------------------------------------------------
  -- Component Definitions
  ----------------------------------------------------------
  component memory_block is
    generic (
      num_elements_a : natural;
      num_elements_b : natural;
      size_address_a : natural;
      size_address_b : natural;
      size_word_a    : natural;
      size_word_b    : natural;
      mem_init       : string := "UNUSED"
    );
    port (
      address_a : in  std_logic_vector(size_address_a-1 downto 0);
      address_b : in  std_logic_vector(size_address_b-1 downto 0);
      clock_a   : in  std_logic := '1';
      clock_b   : in  std_logic := '1';
      data_a    : in  std_logic_vector(size_word_a-1 downto 0);
      data_b    : in  std_logic_vector(size_word_b-1 downto 0);
      wren_a    : in  std_logic := '0';
      wren_b    : in  std_logic := '0';
      q_a       : out std_logic_vector(size_word_a-1 downto 0);
      q_b
                : out std_logic_vector(size_word_b-1 downto 0)
    );
  end component memory_block;

  component fp_func_add is --3 cycle latency
    port (
      a      : in  std_logic_vector(31 downto 0) := (others => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) := (others => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(31 downto 0)
    );
  end component fp_func_add;

  ----------------------------------------------------------
  -- Constant Definitions
  ----------------------------------------------------------
  constant LINESCAN_INPUT_SIZE : natural := WORD_SIZE + OBJECT_ADDRESS_SIZE + PIXEL_ADDRESS_SIZE*2;
  --Valid values = 1,2,4,8,16
  constant MEMORY_RATIO        : natural := NUMBER_OF_CLASSES;
  constant CLASS_NUMBER        : natural := natural(trunc(log2(real(NUMBER_OF_CLASSES))));
  constant ZEROS               : std_logic_vector(MEMORY_RATIO*WORD_SIZE-1 downto 0) := (others => '0');

  ----------------------------------------------------------
  -- Type Definitions
  ----------------------------------------------------------
  type pixel_array is array (0 to NUMBER_OF_PIXELS) of integer range 0 to MAX_OBJECT_NUMBER;

  ----------------------------------------------------------
  -- Signal Definitions
  ----------------------------------------------------------
  signal pixel_tracker, previous_line_pixels, past_line : pixel_array;
  signal reset              : std_logic;
  signal frame_count        : std_logic_vector(WORD_SIZE-1 downto 0);
  signal regress_line       : std_logic_vector(WORD_SIZE-1 downto 0);
  signal reg_reg, reg_latch : std_logic_vector(WORD_SIZE-1 downto 0);
  signal past_linescan_line : std_logic_vector(WORD_SIZE-1 downto 0);

  signal update_mem_address   : std_logic_vector(OBJECT_ADDRESS_SIZE-1 downto 0);
  signal decision_vector_temp : std_logic_vector(MEMORY_RATIO*WORD_SIZE-1 downto 0);
  signal mem_write            : std_logic;
  signal object_clear_write   : std_logic;
  signal ready_write          : std_logic;
  signal ready_write2         : std_logic;
  signal mem_pixel            : std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal combined_pixel       : std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal new_pix_add          : std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal new_pixel            : std_logic_vector(NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);

  signal output_mem_address : std_logic_vector(OBJECT_ADDRESS_SIZE-1 downto 0);
  signal out_object_num     : std_logic_vector(OBJECT_ADDRESS_SIZE-1 downto 0);
  signal newlinenum         : std_logic_vector(WORD_SIZE-1 downto 0);
  signal pix_start, pixend  : std_logic_vector(PIXEL_ADDRESS_SIZE-1 downto 0);

  signal object_num : std_logic_vector(OBJECT_ADDRESS_SIZE-1 downto 0);

  signal start_line      : std_logic;
  signal reg_start_line  : std_logic;
  signal reg_start_line2 : std_logic;

  attribute noprune : boolean;
  attribute noprune of pixel_tracker : signal is true;

begin

  ASSERT (MEMORY_RATIO >= NUMBER_OF_CLASSES)
    report "Invalid number of classes for memory block"
    severity error;

  reset <= not rst_n;

  i_class_result_mem : memory_block --update on b, output on a
    generic map(
      num_elements_a => MAX_OBJECT_NUMBER,
      num_elements_b => MAX_OBJECT_NUMBER,
      size_address_a => OBJECT_ADDRESS_SIZE,
      size_address_b => OBJECT_ADDRESS_SIZE,
      size_word_a    => NUMBER_OF_CLASSES * WORD_SIZE,
      size_word_b    => NUMBER_OF_CLASSES * WORD_SIZE,
      mem_init       => "UNUSED"
    )
    port map(
      address_a => output_mem_address,
      address_b => update_mem_address,
      clock_a   => data_clk,
      clock_b   => data_clk,
      data_a    => (others => '0'),
      data_b    => combined_pixel,
      wren_a    => object_clear_write,
      wren_b    => mem_write,
      q_a       => decision_vector_temp, --registered
      q_b       => mem_pixel
    );

  accumulate : for k in 1 to NUMBER_OF_CLASSES generate

    i_add_fp_func_add : fp_func_add
      port map(
        a      => mem_pixel(WORD_SIZE*(NUMBER_OF_CLASSES-(k-1))-1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k)),
        areset => reset,
        b      => new_pix_add(WORD_SIZE*(NUMBER_OF_CLASSES-(k-1))-1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k)),
        clk    => fast_clk,
        q      => combined_pixel(WORD_SIZE*(NUMBER_OF_CLASSES-(k-1))-1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k))
      );

  end generate;


  --input from line scan <line#, object#, start pix, end pix>
  accept_pixels : process (linescan_clk, line_rst_n)
    variable current_linescan_line : std_logic_vector(WORD_SIZE-1 downto 0);
  begin
    if (line_rst_n = '0') then
      previous_line_pixels <= (others => 0);
      pixel_tracker        <= (others => 0);
      regress_line         <= (others => '0');
      past_linescan_line   <= (others => '0');
    elsif (rising_edge(linescan_clk)) then
      --line count reset =
      if (linescan_obj(PIXEL_ADDRESS_SIZE-1 downto 0) =
          std_logic_vector(to_unsigned(NUMBER_OF_PIXELS-1,
          PIXEL_ADDRESS_SIZE)) and linescan_obj(PIXEL_ADDRESS_SIZE *
          2 - 1 downto PIXEL_ADDRESS_SIZE) = std_logic_vector(
          to_unsigned(0, PIXEL_ADDRESS_SIZE))) then
        start_line <= '1';
      else
        start_line <= '0';
        current_linescan_line := linescan_obj(LINESCAN_INPUT_SIZE-1 downto LINESCAN_INPUT_SIZE-WORD_SIZE);
        object_num <= linescan_obj(LINESCAN_INPUT_SIZE-WORD_SIZE-1 downto PIXEL_ADDRESS_SIZE*2);
        pixend     <= linescan_obj(PIXEL_ADDRESS_SIZE-1 downto 0);
        pix_start  <= linescan_obj(PIXEL_ADDRESS_SIZE*2-1 downto PIXEL_ADDRESS_SIZE);
        newlinenum <= linescan_obj(LINESCAN_INPUT_SIZE-1 downto LINESCAN_INPUT_SIZE-WORD_SIZE);

        if (unsigned(current_linescan_line) /=
            unsigned(past_linescan_line)) then
          --new line
          previous_line_pixels <= pixel_tracker;
          pixel_tracker        <= (others => 0);
        end if;

        for k in 1 to NUMBER_OF_PIXELS loop
          exit when k = unsigned(pixend) + 1;
          if (k >= unsigned(pix_start) and k <= unsigned(pixend))
          then
            --OBJECT NUMBER;
            pixel_tracker(k) <= to_integer(unsigned(object_num));
          end if;
        end loop;
        past_linescan_line <= current_linescan_line;
        regress_line       <= newlinenum;
      end if;
    end if;
  end process;


  --one pixel result at a time, just add in as needed!
  --INPUT =

  process (data_clk, rst_n)
    variable pixel_num         : integer range 0 to NUMBER_OF_PIXELS-1;
    variable current_line      : pixel_array;
    variable regress_frame_num : std_logic_vector(WORD_SIZE-1 downto 0);
  begin
    if (rst_n = '0') then
      new_pixel          <= (others => '0');
      regress_frame_num  := (others => '0');
      update_mem_address <= (others => '0');
      output_mem_address <= (others => '0');
      ready_write        <= '0';
    elsif (rising_edge(data_clk)) then
      reg_start_line2 <= start_line;
      reg_start_line  <= reg_start_line2;

      reg_reg   <= regress_line;
      reg_latch <= reg_reg;

      if (reg_start_line = '1') then
        regress_frame_num := (others => '0');
      elsif (new_results = '1') then
        pixel_num := to_integer(unsigned(class_results_in(NUMBER_OF_CLASSES*WORD_SIZE+PIXEL_ADDRESS_SIZE-1 downto NUMBER_OF_CLASSES*WORD_SIZE)));
        if (pixel_num = 0) then
          regress_frame_num := std_logic_vector(unsigned(regress_frame_num) + 1);
          past_line <= current_line;
          if (unsigned(regress_frame_num) = unsigned(reg_latch)) then
            current_line := pixel_tracker;
          else
            current_line := previous_line_pixels;
          end if;
        end if;
      end if;

      if (new_results = '1') then
        if (pixel_num > 0 and pixel_num < NUMBER_OF_PIXELS-1) then
          if (current_line(pixel_num-1) /= 0 and
              current_line(pixel_num) /= 0 and
              current_line(pixel_num+1) /= 0) then
            --read from memory, add together, re-write to memory
            new_pixel <= class_results_in(NUMBER_OF_CLASSES*
                         WORD_SIZE-1 downto 0);

            update_mem_address <= std_logic_vector(to_unsigned
                                  (current_line(pixel_num), OBJECT_ADDRESS_SIZE));
            ready_write <= '1';
          elsif (current_line(pixel_num-1) = 0 and
                 past_line(pixel_num-1) /= 0) then

            if (current_line(pixel_num) = 0 and
                past_line(pixel_num) /= 0) then

              output_mem_address <= std_logic_vector(
                                    to_unsigned(past_line(pixel_num),
                                    OBJECT_ADDRESS_SIZE));
            end if;
            ready_write <= '0';
            new_pixel   <= (others => '0');
          else
            ready_write <= '0';
            new_pixel   <= (others => '0');
          end if;
        else
          ready_write <= '0';
          new_pixel   <= (others => '0');
        end if;
      else
        new_pixel   <= (others => '0');
        ready_write <= '0';

      end if;
      new_pix_add  <= new_pixel;
      ready_write2 <= ready_write; --pipeline while adder operates
      mem_write    <= ready_write2;
    end if;
  end process;

  output_proc : process (data_clk, rst_n)
  begin
    if (rst_n = '0') then
      decision_vector <= (others => '0');
    elsif (rising_edge(data_clk)) then
      out_object_num <= output_mem_address;
      if (decision_vector_temp /= ZEROS) then
        decision_vector    <= out_object_num & decision_vector_temp;
        object_clear_write <= '1';
      else
        decision_vector    <= (others => '0');
        object_clear_write <= '0';
      end if;
    end if;
  end process;

end architecture;

-------------------------------------------------------------
--
--! @file DRAM_controller.vhd
--! @brief The master driver to pull data from DRAM.
--! @details Passes bursting reads from DRAM through buffer
--!          for use by system
--! @author Monica Whitaker
--! @date October 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use ieee.numeric_std.all;   --! Use numeric standard
use ieee.math_real.all;

use work.Sensor_Package.ALL;
-------------------------------------------------------------
--
--! @brief DRAM_controller
--! @details Passes bursting reads from DRAM through buffer
--!          for use by system
--! @param memory_clk                     Input ref clock for DDR
--! @param system_clk                     Buffer data output clock
--! @param rst_n                          Active low reset
--! @param avm_read_master_read           Master Read enable
--! @param avm_read_master_address        Master address
--! @param avm_read_master_burstcount     Master burstcount
--! @param avm_read_master_readdata       Master readdata
--! @param avm_read_master_readdatavalid  Master data valid
--! @param avm_read_master_waitrequest    Master read waitrequest
--! @param avm_write_master_write         Master write enable
--! @param avm_write_master_address       Master write address
--! @param avm_write_master_writedata     Master writedata
--! @param avm_write_master_waitrequest   Master write waitrequest
--! @param avs_csr_write                  Slave write enable
--! @param avs_csr_address                Slave write address
--! @param avs_csr_writedata              Slave writedata
--! @param avs_csr_waitrequest            Slave write waitrequest
--! @param write_clk                      Output of memory_clk
--! @param read_start                     Enable reading from DDR
--! @param buffer_read_en                 Read enable for FIFO
--! @param buffer_empty                   FIFO empty
--! @param buffer_readdata                FIFO readdata
--
-------------------------------------------------------------
entity DRAM_controller is
  port (memory_clk : in std_logic;
        system_clk : in std_logic;
        rst_n      : in std_logic;

        --read master signals
        avm_read_master_read          : out std_logic;
        avm_read_master_address       : out std_logic_vector(31 downto 0);
        avm_read_master_burstcount    : out std_logic_vector(5 downto 0);
        avm_read_master_readdata      : in  std_logic_vector(127 downto 0);
        avm_read_master_readdatavalid : in  std_logic;
        avm_read_master_waitrequest   : in  std_logic;

        --write master signals -- debug writing signals
        avm_write_master_write       : out std_logic;
        avm_write_master_address     : out std_logic_vector(31 downto 0);
        avm_write_master_writedata   : out std_logic_vector(127 downto 0);
        avm_write_master_waitrequest : in  std_logic;

        --export signals for writing
        avs_csr_write       : in  std_logic;
        avs_csr_address     : in  std_logic_vector(31 downto 0);
        avs_csr_writedata   : in  std_logic_vector(127 downto 0);
        avs_csr_waitrequest : out std_logic;
        write_clk           : out std_logic;

        --conduit export signals
        read_start      : in  std_logic; -- 1 if write done
        buffer_read_en  : in  std_logic;
        buffer_empty    : out std_logic;
        buffer_readdata : out std_logic_vector(127 downto 0)
       );
end entity;

architecture controller_arch of DRAM_controller is
  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component dual_clock_fifo is
    generic (
      lpm_numwords       : natural;
      lpm_width          : natural;
      lpm_widthu         : natural;
      rdsync_delaypipe   : natural;
      underflow_checking : string;
      wrsync_delaypipe   : natural);
    port (
      data      : in  std_logic_vector(lpm_width - 1 downto 0)
                      := (others => 'X');
      wrreq     : in  std_logic := 'X';
      rdreq     : in  std_logic := 'X';
      wrclk     : in  std_logic := 'X';
      rdclk     : in  std_logic := 'X';
      aclr      : in  std_logic := '0';
      q         : out std_logic_vector(lpm_width - 1 downto 0);
      rdempty   : out std_logic;
      wrfull    : out std_logic;
      rdfull    : out std_logic;
      wrempty   : out std_logic;
      rdusedw   : out std_logic_vector(lpm_widthu - 1 downto 0);
      wrusedw   : out std_logic_vector(lpm_widthu - 1 downto 0);
      eccstatus : out std_logic_vector(1 downto 0));
  end component dual_clock_fifo;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  constant BURST_LENGTH      : natural := 32;
  constant BURST_LENGTH_SIZE : natural := 6;
  constant BUFFER_DEPTH      : natural := 1024;
  constant READDATA_SIZE     : natural := DRAM_DATA_SIZE;
  constant TOTAL_BURSTS      : natural := natural(trunc(real((NUMBER_OF_PIXELS*NUMBER_OF_SPECTRAL_BINS)/BURST_LENGTH)));
  constant BYTES_PER_WORD    : natural := natural(trunc(real(READDATA_SIZE)/real(8)));
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  -- state machine states
  type read_states_T is (idle,
                         fifo_wait,
                         mid_burst,
                         finish_reads);
  -------------------------------------------------------------
  -- Signal Declarations
  -------------------------------------------------------------
  -- fifo signals
  signal buffer_write : std_logic;
  signal buffer_full  : std_logic;
  signal buffer_words : std_logic_vector(9 downto 0);

  signal read_state : read_states_T;

  -- extra read master signals
  -- the current read address
  signal read_address : std_logic_vector(31 downto 0);
  -- tracks the number of bursts completed
  signal bursts_completed : std_logic_vector(natural(trunc(log2(real(TOTAL_BURSTS)))) downto 0);
  -- tracks the available room in the fifo
  signal room_in_fifo : std_logic_vector(10 downto 0);
  -- tracks the number of transactions that are waiting to be returned
  signal pending_reads : std_logic_vector(10 downto 0);

  -- extra write master signals
  -- the current write address
  signal write_address : std_logic_vector(31 downto 0);
  -- track number of values written
  signal counter : integer range 0 to TOTAL_BURSTS*BURST_LENGTH+1;
  -- DEBUG: alert read FSM when writing complete
  signal counter_check   : std_logic;
  signal start_address_1 : std_logic_vector(31 downto 0) := x"00000000";

begin
  avs_csr_waitrequest        <= avm_write_master_waitrequest;
  write_clk                  <= memory_clk;
  avm_write_master_address   <= avs_csr_address;
  avm_write_master_write     <= avs_csr_write;
  avm_write_master_writedata <= avs_csr_writedata;

  i_dcfifo_buffer : component dual_clock_fifo
    generic map(
      lpm_numwords       => BUFFER_DEPTH,
      lpm_width          => DRAM_DATA_SIZE,
      lpm_widthu         => 10,
      rdsync_delaypipe   => 4,
      underflow_checking => "OFF",
      wrsync_delaypipe   => 4
    )
    port map(
      data      => avm_read_master_readdata,
      wrreq     => buffer_write,
      rdreq     => buffer_read_en,
      wrclk     => memory_clk,
      rdclk     => system_clk,
      q         => buffer_readdata,
      rdempty   => buffer_empty,
      wrfull    => buffer_full,
      aclr      => open,
      eccstatus => open,
      rdfull    => open,
      rdusedw   => open,
      wrempty   => open,
      wrusedw   => buffer_words
    );
  -------------------------------------------------------------
  -------------------------------------------------------------
  --READ FSM 1
  -- read light/dark matrix values -- addresses x"00000000" to x"0FFFFFFF"
  -------------------------------------------------------------
  -------------------------------------------------------------
  read_FSM_1 : process (memory_clk, rst_n)
  begin
    if (rst_n = '0' or read_start = '0') then
      read_state       <= idle;
      read_address     <= start_address_1;
      bursts_completed <= (others => '0');
      pending_reads    <= (others => '0');
    elsif (rising_edge(memory_clk)) then

      -- DEFAULT SECTION
      -- decrement the pending reads counter if data is returned
      if (avm_read_master_readdatavalid = '1') then
        pending_reads <= std_logic_vector(unsigned(pending_reads) - 1);
      end if;

      case read_state is
        -- IDLE
        -- When idle just sit and wait for the go flag.
        -- Only start if the write state machine is idle as it may
        -- be finishing a previous data transfer.
        -- Start the machine by moving to the fifo_wait state and
        -- initialising address and counters.
        when idle =>
          -- if read_start = '1' then
          read_state       <= fifo_wait;
          read_address     <= start_address_1;
          pending_reads    <= (others => '0');
          bursts_completed <= (others => '0');
          --end if;

        -- FIFO WAIT
        -- When in this state wait for the fifo to have sufficient
        -- space for a complete burst. If so, start a burst by
        -- moving to the mid_burst state. When moving to mid_burst
        -- add the burst value to the pending reads counter.
        when fifo_wait =>
          -- check that fifo has enough space for 32 word burst
          if (unsigned(room_in_fifo) >= BURST_LENGTH + 5) then
            read_state <= mid_burst;
            -- add 32 to the pending reads counter but be
            -- mindful that a word may be returned at the same
            -- time
            if (avm_read_master_readdatavalid = '0') then
              pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH);
            else
              pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH-1);
            end if;

          end if;

        -- MID BURST
        -- Count bursts
        -- If all bursts complete go to finish_reads state.
        -- Otherwise stay in this state if there is room in fifo or
        -- return to fifo_wait if not. As each burst is completed
        -- increment address, bursts completed counter and pending
        -- reads counter. Be mindful to do nothing if waitrequest
        -- is active
        when mid_burst =>
          -- if waitrequest is active do nothing, otherwise...
          if (avm_read_master_waitrequest /= '1') then
            if (bursts_completed = std_logic_vector(to_unsigned(TOTAL_BURSTS - 1, natural(trunc(log2(real(TOTAL_BURSTS))))+1))) then
              read_state <= finish_reads;
              -- no need to check for pending reads complete
              -- as we've just requested another 32 words
            else
              bursts_completed <= std_logic_vector(unsigned(bursts_completed) + 1);
              read_address     <= std_logic_vector(unsigned(read_address) + BURST_LENGTH*BYTES_PER_WORD);
              if (unsigned(room_in_fifo) >= BURST_LENGTH + 5)
              then
                read_state <= mid_burst;
                -- add 32 to the pending reads counter but
                -- be mindful that a word may be returned
                -- at the same time
                if (avm_read_master_readdatavalid = '0') then
                  pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH);
                else
                  pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH - 1);
                end if;
              else
                read_state <= fifo_wait;
              end if;
            end if;

          end if;

        -- FINISH READS
        -- All the read address phases are complete but there will
        -- be readdata pending. Just sit and wait until there is no
        -- readdata pending and then move to idle state. Note that
        -- the pending_reads counter is decremented in the default
        -- section above.
307 when f i n i s h r e a d s => 308 i f ( avm read master readdatava l id = ’1 ’ ) then 309 i f ( unsigned ( pend ing reads ) = 1) then 310 r e ad s t a t e <= i d l e ; 311 end i f ; 312 end i f ; 313 314 end case ; 315 end i f ; 316 end process ; 317 318 avm read master read <= ’1 ’ when r e ad s t a t e = mid burst else ’ 0 ’ ; 319 125 320 r o om i n f i f o <= s t d l o g i c v e c t o r ( r e s i z e ( ( to uns igned ( BUFFER DEPTH, natura l ( trunc ( log2 ( r e a l (BUFFERDEPTH) ) ) ) + 1) − unsigned ( bu f f e r words ) − unsigned ( pend ing reads ) ) , 11) ) ; 321 322 avm read master address <= read addre s s ; 323 324 −− s imply wr i t e data in t o the f i f o as i t comes in ( read a s s e r t e d and 325 −− wa i t r e que s t not a c t i v e ) 326 bu f f e r w r i t e <= avm read master readdatava l id ; 327 328 avm read master burstcount <= s t d l o g i c v e c t o r ( to uns igned ( BURST LENGTH, BURST LENGTH SIZE) ) ; 329 330 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− 331 −− DEBUG sec t i on 332 −−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−− 333 −− −− Writes counter va l u e s f o r t e s t i n g purposes . 
  -- write_FSM : process (memory_clk, rst_n)
  -- begin
  --   if (rst_n = '0') then
  --write_address <= start_address_1;
  --counter_check <= '0';
  --counter <= 0;
  --avm_write_master_write <= '1';
  --   elsif (rising_edge(memory_clk)) then
  --     if (avm_write_master_waitrequest /= '1') then


  --       if (counter = TOTAL_BURSTS*BURST_LENGTH+1) then
  --         avm_write_master_write <= '0';
  --         counter_check <= '1';
  --       else
  --         avm_write_master_writedata <= std_logic_vector(
  --           to_unsigned(counter, READDATA_SIZE));
  --         counter <= counter + 1;
  --         write_address <= std_logic_vector(unsigned(
  --           write_address) + BYTES_PERWORD);
  --       end if;
  --     end if;
  --   end if;
  -- end process;

  -- avs_csr_waitrequest <= avm_write_master_waitrequest;

end architecture;

--------------------------------------------------------------
--
--! @file xcvr_core.vhd
--! @brief Transmission interface
--! @details Contains transceiver phy and seriallite core for
--! transmission over transceivers
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--! Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see .
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
--------------------------------------------------------------
library IEEE; --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL; --! Use standard logic elements.
--------------------------------------------------------------
--
--! @brief xcvr_core
--! @details Contains transceiver phy and seriallite core for
--! transmission over transceivers
--! @param clk_50MHz Input clk for phy management
--! @param xcvr_refclk Transceiver pll reference clock
--! @param clkdata Clock for Atlantic interface
--! @param reset Active high reset
--! @param reset_n Active low reset
--! @param rx_serial_data Serial receiver interface
--! @param tx_serial_data Serial transmission interface
--! @param tx_ready Ready signal for transmission
--! @param rx_ready Ready signal for receiver
--! @param stat_rr_link Indicates link is up
--! @param tdat Data to transmit
--! @param tdav Data available
--! @param tena Enable transmission
--! @param tsop Transmit start of packet
--! @param teop Transmit end of packet
--! @param terr Error in transmit data
--! @param tmty Number of empty bytes in
--! transmit data
--! @param taddr Address of packet to send
--! @param rdav Data available
--! @param rval Data valid
--! @param rdat Incoming data
--! @param rsop Receiver start of packet signal
--! @param reop Receiver end of packet signal
--! @param rerr Receive error
--! @param rmty Number of empty bytes in
--! received data
--! @param raddr Address of packet received
--! @param err_crc_lock CRC error found
--! @param reconfig_reset Reset for reconfiguration interface
--! @param reconfig_read Read request
--! @param reconfig_write Write request
--! @param reconfig_address Reconfiguration address
--! @param reconfig_writedata Data to write on
--! reconfiguration interface
--! @param reconfig_waitrequest Waitrequest from
--! reconfiguration interface
--! @param reconfig_readdata Data read from
--! reconfiguration interface
--------------------------------------------------------------
entity xcvr_core is
  generic (
    NUMBER_OF_LANES : natural := 1;
    LANEWIDTH : natural := 32
  );
  port (
    clk_50MHz : in std_logic;
    xcvr_refclk : in std_logic;
    clkdata : in std_logic;
    reset : in std_logic;
    reset_n : in std_logic;
    rx_serial_data : in std_logic;
    tx_serial_data : out std_logic;

    tx_ready : out std_logic;
    rx_ready : out std_logic;

    stat_rr_link : out std_logic;

    tdat : in std_logic_vector(((NUMBER_OF_LANES * LANEWIDTH)-1) downto 0);
    tdav : out std_logic;
    tena : in std_logic;
    tsop : in std_logic;
    teop : in std_logic;
    terr : in std_logic;
    tmty : in std_logic_vector(1 downto 0);
    taddr : in std_logic_vector(7 downto 0);

    rdat : out std_logic_vector(((NUMBER_OF_LANES * LANEWIDTH)-1) downto 0);
    rdav : out std_logic;
    rval : out std_logic;
    rena : in std_logic;
    rsop : out std_logic;
    reop : out std_logic;
    rerr : out std_logic;
    rmty : out std_logic_vector(1 downto 0);
    raddr : out std_logic_vector(7 downto 0);

    err_crc_lock : out std_logic;

    reconfig_reset : in std_logic;
    reconfig_read : in std_logic;
    reconfig_write : in std_logic;
    reconfig_address : in std_logic_vector(9 downto 0);
    reconfig_writedata : in std_logic_vector(31 downto 0);
    reconfig_waitrequest : out std_logic;
    reconfig_readdata : out std_logic_vector(31 downto 0)
  );
end entity;

architecture arch of xcvr_core is
  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component a10_xcvr_phy is
    port (
      rx_analogreset : in std_logic_vector(0 downto 0) := (others => '0');
      rx_cal_busy : out std_logic_vector(0 downto 0);
      rx_cdr_refclk0 : in std_logic := '0';
      rx_clkout : out std_logic_vector(0 downto 0);
      rx_coreclkin : in std_logic_vector(0 downto 0) := (others => '0');
      rx_datak : out std_logic_vector(3 downto 0);
      rx_digitalreset : in std_logic_vector(0 downto 0) := (others => '0');
      rx_disperr : out std_logic_vector(3 downto 0);
      rx_errdetect : out std_logic_vector(3 downto 0);
      rx_is_lockedtodata : out std_logic_vector(0 downto 0);
      rx_is_lockedtoref : out std_logic_vector(0 downto 0);
      rx_parallel_data : out std_logic_vector(31 downto 0);
      rx_patterndetect : out std_logic_vector(3 downto 0);
      rx_runningdisp : out std_logic_vector(3 downto 0);
      rx_serial_data : in std_logic_vector(0 downto 0) := (others => '0');
      rx_syncstatus : out std_logic_vector(3 downto 0);
      tx_analogreset : in std_logic_vector(0 downto 0) := (others => '0');
      tx_cal_busy : out std_logic_vector(0 downto 0);
      tx_clkout : out std_logic_vector(0 downto 0);
      tx_coreclkin : in std_logic_vector(0 downto 0) := (others => '0');
      tx_datak : in std_logic_vector(3 downto 0) := (others => '0');
      tx_digitalreset : in std_logic_vector(0 downto 0) := (others => '0');
      tx_parallel_data : in std_logic_vector(31 downto 0) := (others => '0');
      tx_serial_clk0 : in std_logic_vector(0 downto 0) := (others => '0');
      tx_serial_data : out std_logic_vector(0 downto 0);
      unused_rx_parallel_data : out std_logic_vector(71 downto 0);
      unused_tx_parallel_data : in std_logic_vector(91 downto 0) := (others => '0')
    );
  end component a10_xcvr_phy;

  component sl2_core is
    port (
      rx_parallel_data_out : in std_logic_vector(31 downto 0);
      rx_coreclk : in std_logic;
      rx_ctrldetect : in std_logic_vector(3 downto 0);
      stat_rr_pattdet : in std_logic_vector(3 downto 0);
      err_rr_disp : in std_logic_vector(3 downto 0);
      tx_coreclk : in std_logic;
      ctrl_tc_force_train : in std_logic;
      mreset_n : in std_logic;
      rxrdp_clk : in std_logic;
      rxrdp_ena : in std_logic;
      -- receive FIFO threshold low - units in elements
      ctl_rxrdp_ftl : in std_logic_vector(7 downto 0);
      ctl_rxrdp_eopdav : in std_logic;
      txrdp_clk : in std_logic;
      txrdp_ena : in std_logic;
      txrdp_sop : in std_logic;
      txrdp_eop : in std_logic;
      txrdp_err : in std_logic;
      txrdp_mty : in std_logic_vector(1 downto 0);
      txrdp_dat : in std_logic_vector(31 downto 0);
      txrdp_adr : in std_logic_vector(7 downto 0);
      -- transmit FIFO buffer threshold high
      ctl_txrdp_fth : in std_logic_vector(7 downto 0);
      flip_polarity : out std_logic;
      rrefclk : out std_logic;
      stat_rr_link : out std_logic;
      err_rr_8berrdet : in std_logic_vector(3 downto 0);
      tx_parallel_data_in : out std_logic_vector(31 downto 0);
      tx_ctrlenable : out std_logic_vector(3 downto 0);
      tx_coreclock : out std_logic;
      rxrdp_sop : out std_logic;
      rxrdp_eop : out std_logic;
      rxrdp_err : out std_logic;
      rxrdp_mty : out std_logic_vector(1 downto 0);
      rxrdp_dat : out std_logic_vector(31 downto 0);
      rxrdp_adr : out std_logic_vector(7 downto 0);
      rxrdp_val : out std_logic;
      rxrdp_dav : out std_logic;
      -- Atlantic FIFO buffer is empty
      stat_rxrdp_empty : out std_logic;
      -- Atlantic FIFO buffer overflow and data lost
      err_tc_rxrdp_oflw : out std_logic;
      -- Atlantic FIFO buffer overflow and data lost
      err_txrdp_oflw : out std_logic;
      txrdp_dav : out std_logic;
      -- frequency offset tolerance FIFO buffer overflow
      -- link restarts
      err_rr_foffre_oflw : out std_logic;
      -- frequency offset tolerance FIFO buffer underflow
      stat_tc_foffre_empty : out std_logic;
      -- end of bad packet character received
      stat_rr_ebprx : out std_logic;
      -- BIP-8 error detected in link management packet
      err_rr_bip8 : out std_logic;
      -- CRC error detected
      err_rr_crc : out std_logic;
      err_rr_fcrx_bne : out std_logic;
      err_rr_roerx_bne : out std_logic;
      -- invalid link management packet received
      err_rr_invalid_lmprx : out std_logic;
      -- start of data control word missing
      err_rr_missing_start_dcw : out std_logic;
      -- start and end address fields do not match
      err_rr_addr_mismatch : out std_logic;
      -- possible catastrophic error
      err_rr_pol_rev_required : out std_logic
    );
  end component;

  component dual_clock_fifo is
    generic (
      enable_ecc : string := "FALSE";
      intended_device_family : string := "Arria 10";
      lpm_hint : string
        := "DISABLE_DCFIFO_EMBEDDED_TIMING_CONSTRAINT=TRUE";
      lpm_numwords : natural;
      lpm_showahead : string := "OFF";
      lpm_type : string := "dcfifo";
      lpm_width : natural;
      lpm_widthu : natural;
      overflow_checking : string := "ON";
      rdsync_delaypipe : natural;
      underflow_checking : string := "ON";
      use_eab : string := "ON";
      wrsync_delaypipe : natural
    );
    port (
      data : in std_logic_vector(lpm_width - 1 downto 0) := (others => 'X');
      wrreq : in std_logic := 'X';
      rdreq : in std_logic := 'X';
      wrclk : in std_logic := 'X';
      rdclk : in std_logic := 'X';
      aclr : in std_logic := '0';
      q : out std_logic_vector(lpm_width - 1 downto 0);
      rdempty : out std_logic;
      wrfull : out std_logic;
      rdfull : out std_logic;
      wrempty : out std_logic;
      rdusedw : out std_logic_vector(lpm_widthu - 1 downto 0);
      wrusedw : out std_logic_vector(lpm_widthu - 1 downto 0);
      eccstatus : out std_logic_vector(1 downto 0)
    );
  end component;

  component xcvr_pll is
    port (
      pll_cal_busy : out std_logic;
      pll_locked : out std_logic;
      pll_powerdown : in std_logic := '0';
      pll_refclk0 : in std_logic := '0';
      tx_serial_clk : out std_logic
    );
  end component;

  component xcvr_reset is
    port (
      clock : in std_logic := '0';
      pll_locked : in std_logic_vector(0 downto 0) := (others => '0');
      pll_powerdown : out std_logic_vector(0 downto 0);
      pll_select : in std_logic_vector(0 downto 0) := (others => '0');
      reset : in std_logic := '0';
      rx_analogreset : out std_logic_vector(0 downto 0);
      rx_cal_busy : in std_logic_vector(0 downto 0) := (others => '0');
      rx_digitalreset : out std_logic_vector(0 downto 0);
      rx_is_lockedtodata : in std_logic_vector(0 downto 0) := (others => '0');
      rx_ready : out std_logic_vector(0 downto 0);
      tx_analogreset : out std_logic_vector(0 downto 0);
      tx_cal_busy : in std_logic_vector(0 downto 0) := (others => '0');
      tx_digitalreset : out std_logic_vector(0 downto 0);
      tx_ready : out std_logic_vector(0 downto 0)
    );
  end component;

  -------------------------------------------------------------
  -- Signal Definitions
  -------------------------------------------------------------
  signal ONES : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

  signal rx_freqlocked : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

  signal ctl_rxrdp_ftl : std_logic_vector(7 downto 0);
  signal ctl_txrdp_fth : std_logic_vector(7 downto 0);
  signal stat_rr_link_min2 : std_logic;
  signal stat_rr_link_min1 : std_logic;

  signal stat_rxrdp_empty : std_logic;
  signal err_tc_rxrdp_oflw : std_logic;
  signal err_txrdp_oflw : std_logic;
  signal err_rr_foffre_oflw : std_logic;
  signal stat_tc_foffre_empty : std_logic;
  signal stat_rr_ebprx : std_logic;
  signal err_rr_bip8 : std_logic;
  signal err_rr_fcrx_bne : std_logic;
  signal err_rr_roerx_bne : std_logic;
  signal err_rr_invalid_lmprx : std_logic;
  signal err_rr_missing_start_dcw : std_logic;
  signal err_rr_addr_mismatch : std_logic;
  signal err_rr_crc : std_logic;

  signal rx_parallel_data : std_logic_vector((NUMBER_OF_LANES * LANEWIDTH)-1 downto 0);
  signal tx_parallel_data : std_logic_vector((NUMBER_OF_LANES * LANEWIDTH)-1 downto 0);
  signal tx_datak : std_logic_vector(3 downto 0);
  signal rx_datak : std_logic_vector(3 downto 0);

  signal rx_coreclk : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
  signal tx_coreclk : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
  signal rx_clkout : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
  signal tx_clkout : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
  signal tx_coreclock : std_logic;
  signal rrefclk : std_logic;

  signal rx_disperr : std_logic_vector(3 downto 0);
  signal rx_errdetect : std_logic_vector(3 downto 0);
  signal rx_patterndetect : std_logic_vector(3 downto 0);

  signal tx_cal_busy_combined : std_logic_vector(0 downto 0);
  signal tx_serial_clk_pll : std_logic;
  signal pll_powerdown : std_logic;
  signal pll_cal_busy : std_logic;
  signal pll_locked : std_logic;
  signal tx_serial_clk : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

  signal tx_cal_busy : std_logic_vector(0 downto 0);
  signal tx_ready_i : std_logic_vector(0 downto 0);
  signal rx_cal_busy : std_logic_vector(0 downto 0);
  signal rx_ready_i : std_logic_vector(0 downto 0);
  signal rx_analogreset_i : std_logic_vector(0 downto 0);
  signal rx_digitalreset_i : std_logic_vector(0 downto 0);
  signal tx_analogreset_i : std_logic_vector(0 downto 0);
  signal tx_digitalreset_i : std_logic_vector(0 downto 0);


  signal w_req : std_logic;
  signal r_req : std_logic;
  signal w_full : std_logic;
  signal r_empty : std_logic;
  signal err_8b_lock : std_logic;
  signal err_addr_mismatch_lock : std_logic;
  signal err_bip8_lock : std_logic;
  signal err_invalid_lmprx_lock : std_logic;
  signal err_missing_lock : std_logic;
  signal err_array : std_logic_vector(4 downto 0);

begin

  generate_ALTGX_clocks :
  for i in 0 to NUMBER_OF_LANES-1 generate
    rx_coreclk(i) <= rx_clkout(0);
    tx_coreclk(i) <= tx_clkout(0);
    tx_cal_busy_combined(i) <= tx_cal_busy(i) or pll_cal_busy;
  end generate;

  generate_xcvr_serial_clocks1 :
  for i in 0 to NUMBER_OF_LANES-1 generate
    tx_serial_clk(i) <= tx_serial_clk_pll;
  end generate;

  u0 : component a10_xcvr_phy
    port map(
      rx_analogreset => rx_analogreset_i,
      rx_cal_busy => rx_cal_busy,
      rx_cdr_refclk0 => xcvr_refclk,
      rx_clkout => rx_clkout,
      rx_coreclkin => rx_coreclk,
      rx_datak => rx_datak,
      rx_digitalreset => rx_digitalreset_i,
      rx_disperr => rx_disperr,
      rx_errdetect => rx_errdetect,
      rx_is_lockedtodata => rx_freqlocked,
      rx_is_lockedtoref => open,
      rx_parallel_data => rx_parallel_data,
      rx_runningdisp => open,
      rx_patterndetect => rx_patterndetect,
      rx_serial_data(0) => rx_serial_data,
      rx_syncstatus => open,
      tx_analogreset => tx_analogreset_i,
      tx_cal_busy => tx_cal_busy,
      tx_clkout => tx_clkout,
      tx_coreclkin => tx_coreclk,
      tx_datak => tx_datak,
      tx_digitalreset => tx_digitalreset_i,
      tx_parallel_data => tx_parallel_data,
      tx_serial_clk0 => tx_serial_clk,
      tx_serial_data(0) => tx_serial_data,
      unused_rx_parallel_data => open,
      unused_tx_parallel_data => (others => '0')
    );

  u1 : sl2_core
    port map(
      rx_parallel_data_out => rx_parallel_data,
      rx_coreclk => rx_coreclk(0),
      rx_ctrldetect => rx_datak,
      stat_rr_pattdet => rx_patterndetect,
      err_rr_disp => rx_disperr,
      tx_coreclk => tx_coreclk(0),
      ctrl_tc_force_train => '0',
      mreset_n => reset_n,
      rxrdp_clk => clkdata,
      rxrdp_ena => rena,
      ctl_rxrdp_ftl => ctl_rxrdp_ftl,
      ctl_rxrdp_eopdav => '0',
      txrdp_clk => clkdata,
      txrdp_ena => tena,
      txrdp_sop => tsop,
      txrdp_eop => teop,
      txrdp_err => terr,
      txrdp_mty => tmty,
      txrdp_dat => tdat,
      txrdp_adr => taddr,
      ctl_txrdp_fth => ctl_txrdp_fth,
      flip_polarity => open,
      rrefclk => rrefclk,
      stat_rr_link => stat_rr_link_min2,
      err_rr_8berrdet => rx_errdetect,
      tx_parallel_data_in => tx_parallel_data,
      tx_ctrlenable => tx_datak,
      tx_coreclock => tx_coreclock,
      rxrdp_sop => rsop,
      rxrdp_eop => reop,
      rxrdp_err => rerr,
      rxrdp_mty => rmty,
      rxrdp_dat => rdat,
      rxrdp_adr => raddr,
      rxrdp_val => rval,
      rxrdp_dav => rdav,
      stat_rxrdp_empty => stat_rxrdp_empty,
      err_tc_rxrdp_oflw => err_tc_rxrdp_oflw,
      err_txrdp_oflw => err_txrdp_oflw,
      txrdp_dav => tdav,
      err_rr_foffre_oflw => err_rr_foffre_oflw,
      stat_tc_foffre_empty => stat_tc_foffre_empty,
      stat_rr_ebprx => stat_rr_ebprx,
      err_rr_bip8 => err_rr_bip8,
      err_rr_crc => err_rr_crc,
      err_rr_fcrx_bne => err_rr_fcrx_bne,
      err_rr_roerx_bne => err_rr_roerx_bne,
      err_rr_invalid_lmprx => err_rr_invalid_lmprx,
      err_rr_missing_start_dcw => err_rr_missing_start_dcw,
      err_rr_addr_mismatch => err_rr_addr_mismatch,
      err_rr_pol_rev_required => open
    );

  u2 : xcvr_pll
    port map(
      pll_cal_busy => pll_cal_busy,
      pll_locked => pll_locked,
      pll_powerdown => pll_powerdown,
      pll_refclk0 => xcvr_refclk,
      tx_serial_clk => tx_serial_clk_pll
    );

  u3 : xcvr_reset
    port map(
      clock => clk_50MHz,
      pll_locked(0) => pll_locked,
      pll_powerdown(0) => pll_powerdown,
      pll_select => (others => '0'),
      reset => reset,
      rx_analogreset => rx_analogreset_i,
      rx_cal_busy => rx_cal_busy,
      rx_digitalreset => rx_digitalreset_i,
      rx_is_lockedtodata => rx_freqlocked,
      rx_ready => rx_ready_i,
      tx_analogreset => tx_analogreset_i,
      tx_cal_busy => tx_cal_busy_combined,
      tx_digitalreset => tx_digitalreset_i,
      tx_ready => tx_ready_i
    );

  fifo_lock : dual_clock_fifo
    generic map(
      lpm_numwords => 32,
      lpm_width => 5,
      lpm_widthu => 5,
      rdsync_delaypipe => 3,
      wrsync_delaypipe => 3
    )
    port map(
      data => err_rr_bip8 & err_rr_crc & err_rr_invalid_lmprx &
              err_rr_missing_start_dcw & err_rr_addr_mismatch,
      wrreq => w_req,
      rdreq => r_req,
      wrclk => rrefclk,
      rdclk => clkdata,
      aclr => '0',
      q => err_array,
      rdempty => r_empty,
      wrfull => w_full,
      rdfull => open,
      wrempty => open
    );
  --Available for future consideration
  err_bip8_lock <= err_array(4);
  err_crc_lock <= err_array(3);
  err_invalid_lmprx_lock <= err_array(2);
  err_missing_lock <= err_array(1);
  err_addr_mismatch_lock <= err_array(0);

  process (rrefclk)
  begin
    if (rising_edge(rrefclk)) then
      if (w_full = '0') then
        w_req <= '1';
      else
        w_req <= '0';
      end if;
    end if;
  end process;

  process (clkdata)
  begin
    if (rising_edge(clkdata)) then
      if (r_empty = '0') then
        r_req <= '1';
      else
        r_req <= '0';
      end if;
    end if;
  end process;


  -------------------------------------------------------------
  -- Generate Zeroes and Ones
  -------------------------------------------------------------
  generate_ZEROES_and_ONES :
  for i in 0 to NUMBER_OF_LANES-1 generate
    ONES(I) <= '1';
  end generate;

  -------------------------------------------------------------
  -- Generate tx_ready and rx_ready
  -------------------------------------------------------------
  tx_ready <= '1' when tx_ready_i = ONES else '0';
  rx_ready <= '1' when rx_ready_i = ONES else '0';

  ctl_rxrdp_ftl <= "00010010"; -- Set arbitrarily (check simulation)
  ctl_txrdp_fth <= "01110000"; -- Set arbitrarily (check simulation)

  --register for link status
  process (clk_50MHz, reset)
  begin
    if (reset = '1') then
      stat_rr_link_min1 <= '0';
      stat_rr_link <= '0';
    elsif (rising_edge(clk_50MHz)) then
      stat_rr_link_min1 <= stat_rr_link_min2;
      stat_rr_link <= stat_rr_link_min1;
    end if;
  end process;
end architecture;

APPENDIX C

MATLAB CODE

% ------------------------------------------------------------
% sort_tb.m
% Testbench for sorting component -- sort in two cycles
% ------------------------------------------------------------
clc;

sort_hdl = hdlcosim_top;
NUM_TRIALS = 50;

for k = 1:NUM_TRIALS
    %Build all the random inputs
    in1 = single(randn());
    in2 = single(randn());
    in3 = single(randn());
    in4 = single(randn());
    in5 = single(randn());
    in6 = single(randn());
    in7 = single(randn());
    in8 = single(randn());
    in9 = single(randn());
    in10 = single(randn());
    in11 = single(randn());
    in12 = single(randn());
    in13 = single(randn());
    in14 = single(randn());
    in15 = single(randn());
    in16 = single(randn());
    in17 = single(randn());
    in18 = single(randn());
    in19 = single(randn());
    in20 = single(randn());

    input_history{k} = [in20 in19 in18 in17 in16 in15 in14 in13 in12 in11 in10 in9 in8 in7 in6 in5 in4 in3 in2 in1];

    %input into system
    [out20 out19 out18 out17 out16 out15 out14 out13 out12 out11 out10 out9 out8 out7 out6 out5 out4 out3 out2 out1 ...
     ind1 ind2 ind3 ind4 ind5 ind6 ind7 ind8 ind9 ind10 ind11 ind12 ind13 ind14 ind15 ind16 ind17 ind18 ind19 ind20] = ...
        step(sort_hdl, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, in11, in12, in13, in14, in15, in16, in17, in18, in19, in20);

    output_history{k} = [out20 out19 out18 out17 out16 out15 out14 out13 out12 out11 out10 out9 out8 out7 out6 out5 out4 out3 out2 out1];
    output_indices{k} = [ind1 ind2 ind3 ind4 ind5 ind6 ind7 ind8 ind9 ind10 ind11 ind12 ind13 ind14 ind15 ind16 ind17 ind18 ind19 ind20];
end;

latency = 2;
for k = 1:NUM_TRIALS-latency
    original = input_history{k}
    % sorted = output_history{k+latency}
    sorted(k,:) = output_history{k+latency}
    temp = output_indices{k+latency};
    % indices = double(temp)
    indices(k,:) = double(temp);

    %compute sort in MATLAB
    [actual(k,:), actual_index(k,:)] = sort(original);

    val_diff = actual-sorted;

    ind_diff = actual_index-indices;
end;

T = table(sorted, actual);
writetable(T, 'sorted.xlsx','Range','B1');
T = table(indices, actual_index);
writetable(T, 'indices.xlsx','Range','B1');
T = table(val_diff, ind_diff);
writetable(T, 'errors.xlsx','Range','B1');

% ------------------------------------------------------------
% camera_ratios_tb.m
% Testbench for verification of correct ratio calculations
% -----------------------------------------------------------
clc;

sort_hdl = hdlcosim_camera_relations;
lines = 10; %648;
hyper_lines = 160;
linescan_pixels = 8000; %1536;
hyper_pixels = 1024; %200;
pixel_ratio = linescan_pixels/hyper_pixels; %7.8125
line_ratio = lines/hyper_lines; %4.05

offset = fi(32, 1, 13, 0);
pix_ratio = fi((1/pixel_ratio), 0, 32, 32);
line_rat = fi((1/line_ratio), 0, 32, 32);
step(sort_hdl, offset, pix_ratio, line_rat, fi(0, 0, 32, 0), fi(0, 0, 13, 0), fi(0, 0, 13, 0));

for k = 1:lines
    for j = 0:linescan_pixels-1
        line = fi(k, 0, 32, 0);
        start = fi(j, 0, 13, 0);
        end_pix = fi(j, 0, 13, 0);

        input_history{j+1} = [line start end_pix];
        [reg_line, reg_start, reg_endpix, ignore] = step(sort_hdl, offset, pix_ratio, line_rat, line, start, end_pix);
        output_history{j+1} = [reg_line reg_start reg_endpix];
        output_flag{j+1} = ignore;
    end;
end;

errors = 0;
sim = zeros(linescan_pixels, 2);
actual = zeros(linescan_pixels, 2);
latency = 1;
for k = 1:lines
    for j = 0:linescan_pixels-1-latency
        original = input_history{j+1};
        computed = output_history{j+1+latency};
        inp = fi(original(2), 0, 13, 0);
        comp = fi(computed(2), 0, 10, 0);

        ignore_flag(j+1) = output_flag{j+1+latency};

        sim(j+1,:) = [inp comp];
        line_pix = fi(j, 0, 13, 0);
        act = floor((line_pix+offset)*pix_ratio);
        actual(j+1,:) = [line_pix act];

        if act ~= comp
            errors = errors + 1;
        end;
    end;
end;

plot(sim(:,1)', sim(:,2)', 'r');
hold on;
plot(actual(:,1)', actual(:,2)', '*');

save test
clear
load test

% -----------------------------------------------------------
% objects_tb.m
% Testbench for classification of objects. Utilizes two objects.
% -----------------------------------------------------------
if (~exist('class_data', 'var'))
    [Img, start_pix, end_pix] = edge_builder('objects_simple.png');
    for k = 1:50
        data(:,:,k) = xlsread('results_luckycharm.xlsx', k+20, 'B2:F65');
    end;
    class_data = single(data);
end;

clc;
classes = 5;
lines = 192*10; %648;
hyper_lines = 10;
linescan_pixels = 1536;
hyper_pixels = 64;
pixel_ratio = linescan_pixels/hyper_pixels; %24
line_ratio = lines/hyper_lines; %192
num_objects = 2;

objects_hdl = hdlcosim_top; % Set up simulation object

object = fi(0, 0, 54, 0);
offset = fi(0, 1, 13, 0);
pix_ratio = fi((1/pixel_ratio), 0, 32, 32);
line_rat = fi((1/line_ratio), 0, 32, 32);

frame_flag = fi(1, 0, 1, 0);
pixel_num = fi(0, 0, 10, 0);
current_pixel = 0;
data_tracker = 1;
new = 0;
in1 = fi(0, 0, 32, 0);
in2 = fi(0, 0, 32, 0);
in3 = fi(0, 0, 32, 0);
in4 = fi(0, 0, 32, 0);
in5 = fi(0, 0, 32, 0);
current_obj_line = 345; %271;
for K = 0:hyper_lines %10
    for M = 0:hyper_pixels %64
        if (current_pixel == 64)
            data_tracker = data_tracker + 1;
            current_pixel = 0;
            current_obj_line = current_obj_line + 1;
        end;
        pixel_num = fi(current_pixel, 0, 8, 0);
        in1.hex = num2hex(class_data(current_pixel+1, 1, data_tracker));
        in2.hex = num2hex(class_data(current_pixel+1, 2, data_tracker));
        in3.hex = num2hex(class_data(current_pixel+1, 3, data_tracker));
        in4.hex = num2hex(class_data(current_pixel+1, 4, data_tracker));
        in5.hex = num2hex(class_data(current_pixel+1, 5, data_tracker));
        current_pixel = current_pixel + 1;

        new_results = fi(1, 0, 1, 0);
        for J = 1:(line_ratio/hyper_pixels) %5 lines
            for X = 1:num_objects
                if J ~= 1 || X ~= 1
                    new_results = fi(0, 0, 1, 0);
                end;
                object = bitconcat(fi(K, 0, 32, 0), fi(X, 0, 6, 0), fi(floor(pix_ratio*start_pix(current_obj_line, X)), 0, 8, 0), fi(floor(pix_ratio*end_pix(current_obj_line, X)), 0, 8, 0));
                % Run data into system
                [out1, out2, out3, out4, out5, objectnum] = step(objects_hdl, object, new_results, pixel_num, in1, in2, in3, in4, in5);
            end;
        end;
    end;
end;

save test.mat
clear;
load test.mat

% -----------------------------------------------------------
% normalize_tb.m
% Testbench for normalize component
% -----------------------------------------------------------
% Start initialization
if (~exist('data', 'var'))
    load('data.mat');
end;

clc;
iterations = 1;
rows = 64;
columns = 64;
product_hdl = hdlcosim_top; % Set up simulation object

for K = 1:iterations
    for J = 0:columns - 1
        for I = 0:rows - 1
            datain = data(I+1, J+1);
            darkin = dark(I+1, J+1);
            lightin = lightI(I+1, J+1);
            meanin = means(1, J+1);
            stddevin = stddevI(1, J+1);
            input_history{I+1,J+1} = [datain, darkin, lightin, meanin, stddevin];
            % Run data into system
            [normalized] = step(normalize_hdl, datain, darkin, lightin, meanin, stddevin);
            output_history{I+1,J+1} = [normalized];
        end;
    end;
end;

% latency = 4 (subtraction) + 1 (comparison) + 3 (mult) + 1 (comparison) +
% 4 (subtraction) + 3 (mult)
latency = 16;
for I = 1:rows+columns-latency
    inputs = input_history{I}
    normalized(I) = output_history{I+latency}

    actual(I) = normalize(inputs(1), inputs(2), inputs(3), inputs(4), inputs(5))
end;

% Output results to file
T = table(normalized', actual');
writetable(T, 'normalize.xlsx', 'Range', 'B2', 'WriteVariableNames', false);

% -----------------------------------------------------------
% inner_product_tb.m
% Testbench for inner product component
% -----------------------------------------------------------
% Start initialization
if (~exist('data', 'var'))
    load('data.mat');
end;

clc;
iterations = 1;
rows = 200;
columns = 1;
product_hdl = hdlcosim_top; % Set up simulation object
M = 0;
for K = 1:iterations
    for J = 0:columns - 1
        for I = 0:4:(rows) - 1
            norm1 = normalize(data(I+1,J+1), dark(I+1,J+1), lightI(I+1,J+1), means(1,I+1), stddevI(1,I+1));
            norm2 = normalize(data(I+2,J+1), dark(I+2,J+1), lightI(I+2,J+1), means(1,I+2), stddevI(1,I+2));
            norm3 = normalize(data(I+3,J+1), dark(I+3,J+1), lightI(I+3,J+1), means(1,I+3), stddevI(1,I+3));
            norm4 = normalize(data(I+4,J+1), dark(I+4,J+1), lightI(I+4,J+1), means(1,I+4), stddevI(1,I+4));
            class1 = class(1,I+1);
            class2 = class(1,I+2);
            class3 = class(1,I+3);
            class4 = class(1,I+4);
            input_history{M+1,J+1} = [norm1, norm2, norm3, norm4, class1, class2, class3, class4];
            % Run data into system
            [partial1, partial2, partial3, partial4, sum_out] = step(product_hdl, norm1, norm2, norm3, norm4, class1, class2, class3, class4);
            output_history{M+1,J+1} = [partial1, partial2, partial3, partial4, sum_out];
            M = M+1;
        end;
    end;
end;

% latency = 5 (inner product)
% latency = 21 (channel sum)
previous1 = single(0);
previous2 = single(0);
previous3 = single(0);
previous4 = single(0);
latency = 14; %26;
for J = 0:columns-1
    for I = 0:(rows/4-1)-latency
        K = 4*I;
        inputs = input_history{I+1,J+1};
        sim = output_history{I+1+latency,J+1}

        norms = [normalize(data(K+1,J+1), dark(K+1,J+1), lightI(K+1,J+1), means(1,K+1), stddevI(1,K+1)) ...
                 normalize(data(K+2,J+1), dark(K+2,J+1), lightI(K+2,J+1), means(1,K+2), stddevI(1,K+2)) ...
                 normalize(data(K+3,J+1), dark(K+3,J+1), lightI(K+3,J+1), means(1,K+3), stddevI(1,K+3)) ...
                 normalize(data(K+4,J+1), dark(K+4,J+1), lightI(K+4,J+1), means(1,K+4), stddevI(1,K+4))];
        actual_sum1 = inner_product(inputs(1), inputs(5), previous1);
        actual_sum2 = inner_product(inputs(2), inputs(6), previous2);
        actual_sum3 = inner_product(inputs(3), inputs(7), previous3);
        actual_sum4 = inner_product(inputs(4), inputs(8), previous4);
        previous1 = actual_sum1;
        previous2 = actual_sum2;
        previous3 = actual_sum3;
        previous4 = actual_sum4;

        total_sum = actual_sum1 + actual_sum2 + actual_sum3 + actual_sum4
    end;
end;

% -----------------------------------------------------------
% regression_tb.m
% Testbench for regression system
% -----------------------------------------------------------
% Start initialization
if (~exist('datafi', 'var'))
    load('test_data.mat');
end;

clc;
lines = 85;
bands = 160;
samples = 110; %1024;
classes = 5; %20;

regression_hdl = hdlcosim_top; % Set up simulation object

in = fi(0, 0, 98, 0);

%write intercepts
for K = 1:classes
    address = bitconcat(fi(1, 0, 14, 0), fi(K, 0, 10, 0), fi(0, 0, 8, 0));
    data = fi(0, 0, 32, 0);
    data.hex = num2hex(class(1,K));
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
end;

step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));

for K = 1:classes
    for J = 1:(bands/5)*8
        % Address generation
        address = bitconcat(fi(1, 0, 14, 0), fi(K, 0, 10, 0), fi(J, 0, 8, 0));
        data = fi(0, 0, 32, 0);
        data.hex = num2hex(class(J+1,K));

        % Write classes
        step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    end;
end;

%Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));

%WRITE MEANS
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(1, 0, 22, 0), fi(J-1, 0, 10, 0));
    data = fi(0, 0, 32, 0);
    data.hex = num2hex(means(1,J));

    % Write means
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
end;

% Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));

%WRITE STDDEVI
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(4, 0, 22, 0), fi(J-1, 0, 10, 0));
    data = fi(0, 0, 32, 0);
    data.hex = num2hex(stddevI(1,J));

    % Write stddevI
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), address, data);
end;

% Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));

%READ CLASSES
for K = 1:classes
    for J = 1:(bands/5)*8+1
        % Address generation
        address = bitconcat(fi(1, 0, 14, 0), fi(K, 0, 10, 0), fi(J-1, 0, 8, 0));
        % Read classes
        step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
        step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
        step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
        [~, class_read(J,K)] = step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    end;
end;
T = table(class_read, class);
writetable(T, 'classes.xlsx', 'Range', 'B1');

%READ MEANS
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(1, 0, 22, 0), fi(J-1, 0, 10, 0));
    % Read means
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    [~, mean_read(J,1)] = step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
end;
T = table(mean_read, means');
writetable(T, 'means.xlsx', 'Range', 'B1');

%READ STDDEVI
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(4, 0, 22, 0), fi(J-1, 0, 10, 0));
    % Read stddevI
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
    [~, stddev_read(J,1)] = step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
end;
T = table(stddev_read, stddevI');
writetable(T, 'stddevs.xlsx', 'Range', 'B1');
%Set Enable
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 32, 0), fi(1, 0, 32, 0));
% Set Interrupt Enable
step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(1, 0, 32, 0), fi(1, 0, 32, 0));
% End initialization

for K = 1:lines
    sum = single(zeros(samples, classes));

    for J = 0:samples - 1
        for I = 0:5:bands - 1
            % Start test data generation
            in1 = bitconcat(fi(I, 0, 8, 0), fi(J, 0, 10, 0), datafi(K,I+1,J+1), lightIfi(I+1,J+1), darkfi(I+1,J+1));
            in2 = bitconcat(fi(I+1, 0, 8, 0), fi(J, 0, 10, 0), datafi(K,I+2,J+1), lightIfi(I+2,J+1), darkfi(I+2,J+1));
            in3 = bitconcat(fi(I+2, 0, 8, 0), fi(J, 0, 10, 0), datafi(K,I+3,J+1), lightIfi(I+3,J+1), darkfi(I+3,J+1));
            in4 = bitconcat(fi(I+3, 0, 8, 0), fi(J, 0, 10, 0), datafi(K,I+4,J+1), lightIfi(I+4,J+1), darkfi(I+4,J+1));
            in5 = bitconcat(fi(I+4, 0, 8, 0), fi(J, 0, 10, 0), datafi(K,I+5,J+1), lightIfi(I+5,J+1), darkfi(I+5,J+1));
            % End test data generation

            % Run data into system
            step(regression_hdl, in1, in2, in3, in4, in5, fi(1, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
        end;
    end;
    % Wait for interrupt
    irq = fi(0, 0, 1, 0);
    while (irq.data ~= 1)
        [irq, ~] = step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(0, 0, 32, 0), fi(0, 0, 32, 0));
    end;

    for M = 1:classes
        for J = 0:samples - 1
            % Address generation
            address = bitconcat(fi(1, 0, 13, 0), fi(M, 0, 6, 0), fi(J, 0, 13, 0));

            % Read results
            step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
            step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
            [~, sum(J+1,M)] = step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(0, 0, 1, 0), address, fi(0, 0, 32, 0));
        end;
    end;
    % Clear Interrupt
    step(regression_hdl, in, in, in, in, in, fi(0, 0, 1, 0), fi(0, 0, 1, 0), fi(1, 0, 1, 0), fi(2, 0, 32, 0), fi(1, 0, 32, 0));
    % Write results and expected to file
    T = table(sum);
    writetable(T, 'results_lc.xlsx', 'Sheet', K, 'Range', 'B1');
    [model, exact] = calculation_test(datafi(K,1:bands,1:samples), dark, lightI, means_test, stddevI_test, class_test(:,1:classes), K);

end;

save test_data
clear;
load test_data

% -----------------------------------------------------------
% normalize.m
% Compute the normal value as done in logistic regression calculation
% -----------------------------------------------------------
function normalized = normalize(data, dark, lightI, mean, stddevI)
    diff = max(single(data - dark), single(0));
    corrected = min(single(diff .* lightI), single(1));
    normalized = single((corrected - mean) .* stddevI);

% -----------------------------------------------------------
% inner_product.m
% Compute the inner product as done in logistic regression calculation
% -----------------------------------------------------------
function partial_sum = inner_product(normalized, class, previous)
    product = single(normalized * class);
    partial_sum = single(product + previous);

% -----------------------------------------------------------
% calculation_test.m
% Compute the probability using logistic regression and write to spreadsheet
% -----------------------------------------------------------
function [model, exact] = calculation_test(datafi, dark, lightI, mean_in, stddevI_in, class_in, sheet)
    [~, classes] = size(class_in);
    [~, rows, columns] = size(datafi);
    partial_sum_model = single(zeros(columns, classes));
    partial_sum_exact = zeros(columns, classes);
    for M = 1:classes
        for J = 1:columns
            previous_model = class_in(1,M); %intercept
            previous_actual = double(class_in(1,M));
            for I = 1:rows
                norm = normalize(single(datafi(1,I,J)), dark(I,J), lightI(I,J), mean_in(I), stddevI_in(I));
                partial_sum_model(J,M) = inner_product(norm, class_in(I+1,M), previous_model);
                previous_model = partial_sum_model(J,M);

                partial_sum_exact(J,M) = (min(max(double(datafi(1,I,J)) - double(dark(I,J)), 0) .* double(lightI(I,J)), 1) ...
                    - double(mean_in(I))) .* double(stddevI_in(I)) .* double(class_in(I+1,M)) + previous_actual;
                previous_actual = partial_sum_exact(J,M);
            end;
        end;
    end;
    model = partial_sum_model;
    exact = partial_sum_exact;
    T = table(model, exact);
    writetable(T, 'results_lc.xlsx', 'Sheet', sheet, 'Range', 'P1');
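
For reference, the arithmetic carried out by normalize.m, inner_product.m and the inner loop of calculation_test.m can be summarized in two equations. The symbols are introduced here only for illustration and do not appear in the listings: x is the raw sample (datafi), d the dark value (dark), g the gain factor (lightI), mu the per-band mean (mean_in), s the per-band scale factor (stddevI_in), w the class coefficient (rows 2 and up of class_in), and b the intercept (row 1 of class_in). Index i runs over the R rows (bands), j over samples, and m over classes. The names lightI and stddevI suggest stored reciprocals, but that reading is an assumption; the equations only restate the multiplications the code performs.

```latex
\begin{align}
  n_{i,j} &= \Bigl(\min\bigl(\max(x_{i,j} - d_{i,j},\, 0)\; g_{i,j},\; 1\bigr) - \mu_i\Bigr)\, s_i \\
  S_{j,m} &= b_m + \sum_{i=1}^{R} n_{i,j}\, w_{i,m}
\end{align}
```

As listed, calculation_test.m stops at the accumulated sum S and writes it to the spreadsheet; no logistic function is applied to S in these files. The "model" output evaluates the sum in single precision through normalize and inner_product, while the "exact" output evaluates the same expression in double precision.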