DEVELOPMENT OF A SMART CAMERA SYSTEM ON AN FPGA
by
Monica Jane Whitaker
A thesis submitted in partial fulfillment
of the requirements for the degree
of
Master of Science
in
Electrical Engineering
MONTANA STATE UNIVERSITY
Bozeman, Montana
November, 2016
©COPYRIGHT
by
Monica Jane Whitaker
2016
All Rights Reserved
ACKNOWLEDGEMENTS
I would like to acknowledge the faculty and staff of the Electrical and
Computer Engineering Department as well as those of the Gianforte School
of Computing at Montana State University for their continued support and
encouragement throughout my undergraduate and graduate education.
Funding Acknowledgment
This work was kindly supported by the Montana Research and Economic
Development Initiative, Montana Board of Research and Commercialization
Technology and Resonon, Inc.
TABLE OF CONTENTS
1. INTRODUCTION AND BACKGROUND
Introduction
Hyperspectral Imaging
Classifying a Hyperspectral Image
Object Sorting
Smart Cameras
Existing Smart Cameras
Matrix Vision
Matrox Imaging
Eye Vision Technology
Teledyne Dalsa
The Winner is ... None of the Above
2. MOTIVATION
Beneficiaries
Current Processing System
Why FPGA?
3. SYSTEM DESIGN
Logistic Regression Algorithm
Hardware Elements
Arria 10 SoC
Development Board Components
Additional Custom Boards
Project Overview
Camera Interface
DRAM Interface
Computation Unit
Pixel Classification
Object Classification
FPGA Interface
Performance
4. IMPLEMENTATION DETAILS
Programmable Oscillator
Registers
Programmable Clock Generator
Design Decisions
Registers
Burning a Configuration
Utilizing the Clock Generator
Altera IP
Timing Constraints
Toolchain Fights
SignalTap
TimeQuest Timing Analyzer
Chip Planner
MATLAB
Toolchain Tricks
5. TEST AND VERIFICATION
Camera Interface
DRAM
Computation Unit
FPGA to FPGA Transmission
6. CONCLUSION
REFERENCES CITED
APPENDICES
APPENDIX A: Register Descriptions
APPENDIX B: VHDL Code
APPENDIX C: MATLAB Code
LIST OF TABLES
3.1 Pixel Information in DRAM
4.1 Register settings for Si570
A.1 ENABLE Register Description
A.2 IRQ ENABLE Register Description
A.3 IRQ PENDING Register Description
A.4 NUM BINS Register Description
A.5 NUM PIXELS Register Description
A.6 NUM CLASSES Register Description
A.7 FRAME COUNT Register Description
A.8 MEAN Register Description
A.9 STD DEV I Register Description
A.10 COEFFICIENT Register Description
A.11 INNER PRODUCT Register Description
A.12 DECISION VECTOR Register Description
LIST OF FIGURES
1.1 A mock-up of the full system as it is intended to operate.
1.2 An example of hyperspectral line scan images over several frames.
1.3 Robot sorting almonds.
2.1 Depiction of a typical image processing system [1].
2.2 This is a depiction of the future of image processing, with an integrated camera sensor and FPGA processor [1].
2.3 Graphical depiction of relative resources in the Arria 10 SoC chip.
3.1 Example inner product calculation.
3.2 High level view of the components external to the SoC utilized in the system. The colored regions depict the individual PCBs.
3.3 The PCBs created for the hyperspectral camera.
3.4 Block diagram of the full system functionality.
3.5 Block diagram of the camera interface subsystem.
3.6 Block diagram of the memory subsystem.
3.7 Block diagram of the computation subsystem.
3.8 The connection between the Arria V development board (top) and the Arria 10 development board (bottom).
4.1 Factory Default Clock Register Settings for Si570
4.2 Preferred Clock Register Settings for Si570
4.3 Diagram of Pin Assignments for VersaClock 6 Programmable Clock Generator
4.4 A fitted floor plan in the Arria 10.
5.1 Generated plot depicting ratios between the pixels of the line scan camera and the pixels of the hyperspectral camera.
5.2 Zoomed in plot of ratios between monochrome and hyperspectral camera pixels
ABSTRACT
In recent years, hyperspectral cameras have been appearing in many applications
that need more information than what conventional color cameras can provide. A
hyperspectral camera is able to capture data ranging in wavelengths from the visible
spectrum all the way into the infrared. In this way, it is able to 'see' hundreds of colors,
many more than the human eye or a standard camera, which typically uses only 3
spectral values (corresponding to the standard red, green, and blue colors). Due to
the large amount of data that these cameras can generate at increasingly faster frame
rates, conventional computers are not able to perform all the necessary processing in
real-time. Because of this limitation, a new system is needed to perform the image
processing. This master’s thesis is meant to contribute to the development of a smart
camera targeted for hyperspectral image processing using a Field Programmable Gate
Array (FPGA) and object sorting with a prototype waterfall system. Through the
use of a Hardware Description Language (HDL), a currently used image processing
algorithm has been implemented to classify pixels. Additionally, an architecture
for full object classification has been designed and tested for the FPGA.
High-speed transceivers are used to move data between multiple FPGA development
boards. When paired with a hyperspectral camera and a monochrome line scan
camera, this prototype system is capable of scanning objects in freefall and deciding
within milliseconds whether or not to keep the object. This decision will dictate the
action of air jets to displace unwanted objects. This full system is potentially of
interest to small businesses or farms as it will enable farmers to perform their own
premium bulk sorting in a cost-effective manner.
INTRODUCTION AND BACKGROUND
Introduction
A smart camera system is being developed to target sorting applications using a
hyperspectral camera. The overall system in development includes the hyperspectral
camera, a monochrome line scan camera and a sorting mechanism that uses air jets
to perform the physical sorting. This camera system will replace existing systems by
removing the need for cables between the camera and the processing unit as well as
replacing conveyor belts and robots with a vibrator feeder and air jets. In doing so,
with the help of the hyperspectral data, sorting may become more accurate and the
unit may end up being cheaper and consequently more accessible to small businesses.
This project is a prototype for the end result and is consequently not as compact as the
final product is anticipated to be, but it performs all the necessary calculations and
produces a result to trigger the air jets for the sorting of objects with high precision
due to the inclusion of hyperspectral data. This smart camera system utilizes two
System-on-Chip (SoC) devices that each consist of a Field Programmable Gate Array
(FPGA) fabric and a Hard Processor System (HPS) implemented on a single silicon
chip for easy and fast interactions. The fabric of these SoC FPGAs is used for the
processing of all data generated by the cameras, while the HPS is utilized in user
interactions and memory transfers. The monochrome line scan camera is included
for detection of the objects at the time of imaging and building an object profile
for the processing unit to make a complete object decision based on the compiled
individual pixel decisions. The decisions are made based on classes designed around
the hyperspectral characteristics found using the hyperspectral camera included in
this system.
Figure 1.1: A mock-up of the system as it is intended to operate. The product
will fall from the conveyor belt and be imaged by both cameras (one high resolution
monochrome and one hyperspectral) simultaneously before either continuing its fall
or being ejected by air jets.
This thesis focuses primarily on the development and implementation of the
image processing algorithm in addition to the interaction between development
boards. The air jet system is in development by a separate team of engineers, as
is the monochrome camera processing subsystem that identifies object boundaries
for the hyperspectral camera. In implementation of the prototype design for this
project, the author of this thesis is responsible for the development and testing of the
image processing algorithm for the hyperspectral camera data and the transceiver
communication between development boards. The author also worked with the
development tools to compile the whole project and fix timing errors. Additionally,
this author set up the data access method between the FPGA and the off-chip
Dynamic Random Access Memory (DRAM) connected directly to the FPGA fabric
via dedicated and hardened DRAM controllers. The details of this system are
abstracted away and a few control lines are available for use by other subsystems.
Hyperspectral Imaging
Resonon defines a hyperspectral image as a digital image with far more spectral
information for each pixel than traditional color cameras. The resulting data can
be pictured as a cube with dimensions in the spatial x and y directions and a third
dimension in the spectral wavelength, as seen in Figure 1.2. The cameras utilized in
the sorting applications explored within this thesis are line scan cameras, so a frame
consists of a single line (spatial y = 1) of pixels (spatial x) and then the spectral
bands occupying the 'third' dimension. With the extra wavelength values, including
those in the infrared, hyperspectral cameras are able to sense much more information
than the human eye or a typical RGB color camera. This technology is used in
applications ranging from remote sensing to quality control to sorting [2].
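The data layout described above can be sketched in a few lines of Python. This is only an illustration: the sizes, the made-up sample values, and the names `make_frame` and `spectrum` are assumptions and do not come from the camera or from Resonon's software.

```python
# Hypothetical sizes chosen only for illustration; not taken from the thesis.
NUM_PIXELS = 4   # spatial x samples in one line
NUM_BANDS = 3    # spectral bands recorded for each pixel
NUM_FRAMES = 2   # successive line scans (each frame has spatial y = 1)

def make_frame(frame_index):
    """Return one line-scan frame as frame[x][band], filled with made-up values."""
    return [[frame_index * 100 + x * 10 + band for band in range(NUM_BANDS)]
            for x in range(NUM_PIXELS)]

# Stacking frames over time rebuilds the second spatial dimension of the cube.
cube = [make_frame(f) for f in range(NUM_FRAMES)]

def spectrum(cube, frame_index, x):
    """Spectral signature (one value per band) of a single pixel."""
    return cube[frame_index][x]

print(spectrum(cube, 1, 2))
```

Indexing the cube as `cube[frame][x][band]` mirrors the line scan case, in which each frame contributes one spatial line and the frames accumulate along the direction of object motion.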
Classifying a Hyperspectral Image
Every pixel within an object contains a nearly unique spectral signature, which
can be used to classify it. To do so, a class is defined by compiling a
variety of images of the object and determining the spectral signature that most
commonly defines the pixels within the object. This is done for each of the possible
classes expected to be seen in the surveyed objects. In this way, each class is a
vector of values across all the spectral bands. Using these classes, a statistically
based machine learning algorithm is used to compute the probability that
the pixel belongs to each class. A number of different machine learning algorithms
are acceptable for this objective; however, a logistic regression approach has been
chosen for the implementation of the prototype based on currently used systems. For
logistic regression, the computation utilizes an inner product calculation
between the spectral signature of the pixel and each considered class to reach a
probability. The highest probability is kept for each pixel in an object; the
probabilities of each class are then added over the object, and the class with the
highest total is determined to be the class to which the object belongs.
Figure 1.2: An example of hyperspectral line scan images over several frames
(lines). [3]
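One possible reading of the object-level decision is sketched below in Python: per-pixel class probabilities are summed over the object and the class with the largest total wins. The function name `classify_object` and the probability values are hypothetical, and the hardware described later in the thesis may accumulate its decisions differently.

```python
def classify_object(pixel_probs):
    """Sum per-class probabilities over all pixels of one object.

    pixel_probs[p][c] is the probability that pixel p belongs to class c.
    Returns the index of the winning class and the per-class totals.
    """
    num_classes = len(pixel_probs[0])
    totals = [0.0] * num_classes
    for probs in pixel_probs:
        for c, p in enumerate(probs):
            totals[c] += p
    best = max(range(num_classes), key=lambda c: totals[c])
    return best, totals

# Made-up probabilities for an object of 3 pixels and 2 classes.
pixels = [[0.9, 0.2], [0.4, 0.7], [0.8, 0.3]]
best, totals = classify_object(pixels)
print(best, totals)
```

In this toy case the second pixel on its own favors class 1, but the totals over the whole object favor class 0, which is the kind of outlier a whole-object decision is meant to absorb.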
Object Sorting
Sorting has been a large concern in developed nations for many years as
manufacturers strive to put out quality products. It is gaining popularity in
developing countries and becoming even more important in the places where it already
exists, as quality control is brought to the forefront of society’s attention. Particularly
in the food industry, increased industrialization has brought forward a push toward
healthy convenience foods. Sorting is also very important in agricultural applications
as farmers need to sort their crops after harvesting. In many cases, this is done by an
industrial company who then reports back with the percent loss due to the sorting
mechanism. Of course, by sending it away, farmers have no way of verifying the
reported loss and it would be easier and more reliable for them to have their own
means of sorting. Machine sorting helps to avoid the inconsistent nature of manual
sorting [4] and avoids significant loss of good product that may result from vibration
sorting or other mechanical means. Sorting machines come in a variety of shapes,
sizes, and technologies. These include using lasers, cameras, or x-ray in conjunction
with robotic arms or air jet systems to sort products and separate the bad from the
good [4]. As the technology advances, sorting capabilities will be expected to advance as well.
Figure 1.3: Robot sorting almonds [2]
Smart Cameras
In many applications, cameras of all varieties are used to acquire data about
a subject that can be reviewed at a later time. This
data is also generally processed external to the system in which it is acquired. However, it
is becoming more necessary and common for processing to happen on-board, enabling
the system to adjust in real-time. Because of this, real-time processing is in high
demand and the techniques are still being perfected. Further, the algorithms to
process data generated by cameras are in constant refinement as researchers learn
what they want to see from the data and how best to achieve those results. As
algorithms are refined and cameras are built to generate more data than ever before, it
becomes necessary to have the right infrastructure to support the real-time processing
of imaging data, and thus we find the niche for a smart camera.
Existing Smart Cameras
There are several smart cameras currently in existence, including some with a
programmable FPGA or selectable image processing algorithms to utilize in the
desired application. These include several by Matrix Vision and some by other
companies such as Matrox Imaging, EVT, and Teledyne Dalsa, as further described
in the following subsections. Though these cameras are likely very useful in
some applications which require on-board image processing, they lack the real-time
processing advantages gained in the use of the Arria 10 FPGA, which are detailed in
the last subsection below.
Matrix Vision
Matrix Vision has created several smart cameras, two of which are notable for
image processing in industry. The mvBlueGEMINI is touted as a ’Tool box technology
camera’ and includes an SoC with FPGA and Dual-Core Cortex-A9 with 800 MHz-
capable clocks and a camera sensor with 1280 x 1024 resolution. The software that
comes with the camera includes a Graphical User Interface (GUI) with which users
can choose the task to complete. The frame rate on this single sensor is unspecified. [5]
The other smart camera by Matrix Vision is the mvBlueLYNX-X. This camera
has options for either CCD or CMOS sensors in addition to a hybrid dual core. This
features a Cortex-A8 ARM CPU with a separate real-time Digital Signal Processing
(DSP) unit with video interface. The CPU has a clock speed up to 1 GHz, while the
DSP can run up to 800 MHz. Though available in a number of different resolutions,
the largest, 2592 x 1944 has a maximum frame rate of 14.4 Hz and the next largest,
1280 x 1024 has a maximum frame rate of 60 Hz. [6]
Matrox Imaging
Matrox makes a smart camera entitled the Iris GT which comes with a design
assistant and a web-based interface for the integrated development environment. This
camera has an Intel Atom embedded processor running Windows as well as a built-
in keyboard, monitor, and mouse for friendly user interface. It is compatible with
a variety of monochrome and color CCD sensors. Matrox claims this camera and
software are ideal for a variety of machine vision applications including agriculture,
aerospace, and more. The highest frame rate is 110 frames per second (fps), with an
effective resolution of 640 x 480. [7]
Eye Vision Technology
Eye Vision Technology (EVT) creates several variations of smart cameras. The
RazerCam, for instance, is packaged with a freely programmable FPGA and two ARM
Cores based on the Xilinx Zynq SoC. Users are limited to choosing between one of
two matrix sensors or a line scan sensor with 4K resolution. That line sensor claims
a frame rate of 10000 fps with 10-bit pixel data, but the matrix sensors are not above
60 fps. The ARM cores are running Linux for user convenience in interaction. The
greatest drawback of this camera is the lack of hardened floating point on the FPGA,
which could hinder the speed or accuracy of results, not to mention development
speed. EVT also has a series of EyeCheck smart cameras which are almost all around
30 or 60 fps at resolutions in the thousands. One version has 180 fps and a Xilinx
Artix-7 FPGA with 28K logic elements. This FPGA is in the low-end of Xilinx’s
product portfolio, designed to optimize power and cost. [8]
Teledyne Dalsa
Teledyne Dalsa offers several vision systems with embedded software applications.
There are monochrome sensors available with resolution up to 1600 x 1200. The
processing includes an embedded CPU and DSP with a choice of software. These
are built robustly for integration in factory environments. These cameras are ideal
for still-image quality control and do not have the high clock rate possible with an
FPGA. [9]
The Winner is ... None of the Above
As seen here, there are many different options for smart cameras already on the
market that could be fitted with a hyperspectral front end and used for sorting objects.
But ultimately, none of these were chosen. This is because they lack what could be
considered the best combination of options. Some of these are outfitted with DSP
software and pre-programmed algorithms to choose from. Others use FPGAs for user
configurability. However, the DSP software is not all-encompassing and the FPGAs
more than likely do not have hardened floating point blocks. In the application
space targeted here, the hardened floating point is particularly valuable for ’cheaper’
calculations with greater accuracy. Further, by using only an FPGA to do all the
processing, any algorithm could be configured and used. Real-time processing also
greatly benefits from the deterministic latencies of FPGAs whereas other systems are
compromised by the inclusion of numerous memory accesses or operating systems.
Additionally, the sensors available for these cameras have frame rates less than 100
fps in most cases and the ideal sensor will be collecting data much faster than that.
For these reasons, it was decided that a new smart camera should be developed and
thus, this project was born.
MOTIVATION
Beneficiaries
This project is done in support of, and with support from, Resonon Inc. in
the belief that they will be able to utilize the smart camera in their machine vision
technology systems. Upon completion of a system prototype, of which this thesis
is a subsection, they could utilize the processing method and small form factor in
other integrated systems that they pair with their optical technology. Further, the
Montana Board of Research and Commercialization Technology helped to start the
work on this project and its first proof of concept iteration as they were providing
the primary source of funding for materials and man-hours spent developing this
technology implementation.
The primary focus for this smart camera implementation is in food sorting, but
the technology could be utilized in any sort of assembly-line environment requiring
quality assurance checks. Currently, in areas using the Resonon machine vision
technology, there is still the need for manual sorting after the machine has performed
its sorting because the current system is not capable of processing all the necessary
data at a sufficiently fast speed in order to make highly accurate classifications. Due
to the unavailability of a suitable image processing system, the images are lower
resolution than the Resonon optics technology is ultimately capable of in order to
allow processing to be done on a traditional PC. The goal of having an efficient
real-time integrated machine sorting system that is able to process high-resolution
images is to eliminate the need for manual sorting after the machine, which will free
up workers for other tasks. Using an FPGA enables a fully customizable smart camera
implementation that could apply in several application areas.
Current Processing System
A typical image processing system is shown in Figure 2.1. This system is
comprised of a camera, a frame grabber to configure and capture image data from
the camera, and a computer to perform the processing.
Figure 2.1: Depiction of a typical image processing system [1].
Though this method has worked for many years, it limits the speed at which
images can be processed due to several bottlenecks. The first bottleneck comes from
the cables that limit bandwidth. The second bottleneck is the speed of the computer
that limits the speed of real-time computations. The previous proof-of-concept system
utilized Camera Link connections to connect the camera to the FPGA. The Camera
Link standard was designed modeling the Channel Link technology, which is able to
transmit data at up to 2.38 Gbps [10]. With current applications of image processing,
the need for real-time results is growing and placing a strain on the capabilities of
existing systems. This project seeks to provide a solution for the replacement of these
traditional systems by integrating all three components as shown in Figure 2.2 and
removing the need for any cables. The prototype implementation does involve short ribbon
cables to move data between the board housing the camera sensor and the FPGA
board. This is done so that the camera sensor can be easily placed at a 90° angle
to the board for this prototype. Because the cables are not required for data
movement, they could easily be removed in a final product and will eventually be
replaced by board-edge connectors.
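To make the cable bottleneck described above concrete, the raw data rate of a line scan hyperspectral sensor can be compared against the 2.38 Gbps Channel Link figure cited in the text. The following Python sketch uses purely hypothetical sensor parameters (1024 pixels per line, 300 bands, 12-bit samples, 1000 frames per second); none of these values are taken from the thesis or from a specific Resonon camera.

```python
CHANNEL_LINK_GBPS = 2.38  # maximum Channel Link rate cited in the text [10]

def required_gbps(pixels_per_line, bands, bits_per_sample, frames_per_second):
    """Raw sensor data rate in gigabits per second, ignoring protocol overhead."""
    bits_per_frame = pixels_per_line * bands * bits_per_sample
    return bits_per_frame * frames_per_second / 1e9

# Hypothetical sensor parameters, for illustration only.
rate = required_gbps(pixels_per_line=1024, bands=300,
                     bits_per_sample=12, frames_per_second=1000)
print(f"{rate:.4f} Gbps, exceeds link: {rate > CHANNEL_LINK_GBPS}")
```

Under these assumed numbers the sensor would produce more data than a single link of that rate could carry, which is the situation that motivates placing the processing next to the sensor.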
Figure 2.2: This is a depiction of the future of image processing, with an integrated
camera sensor and FPGA processor [1].
Why FPGA?
One of the biggest advantages of FPGAs over standard computers is the
deterministic low-latency data paths achievable in custom application-specific archi-
tectures. CPUs have fixed architectures and variable latency depending on where
the data is stored or moved (cache, main memory, etc.). FPGAs are made of
programmable logic blocks, SRAM, and DSP blocks that can be reconfigured for
varying applications. The architecture of an FPGA is like a grid, with logic connected
over interconnects between blocks. Because of this structure, FPGAs have no
cache and a flexible architecture can be programmed by the user using a hardware
description language. In doing so, ultimate control is maintained over what occurs on
each clock cycle and even where each of the internal registers is placed to obtain the
best path through the device. The deterministic latency is key for real-time systems
as the user is able to guarantee that the performance is real-time. Fabric is also easily
expanded by adding more logic blocks, which enables manufacturers to create similar
devices of varying size and complexity. In this way, FPGAs have been tailored to be
suitable for a whole market of people with varying needs, resources, and cost targets.
In this project, two SoCs are used instead of standard FPGAs, which do not
include ARM CPUs. An SoC contains a dual-core ARM processor on the same
chip as the FPGA along with hardened peripherals (Ethernet, USB, etc.); together,
the processor and peripherals are
referred to as a Hard Processor System (HPS). Providing an HPS that can serve
as a smart interface between the FPGA logic and the outside world makes it easier
to communicate with external computers for the passing of data. This is generally
accomplished using the Ethernet connection to obtain an IP address on the Linux
system running on the HPS. However, while the HPS is still functioning as a standard
computer and is subject to timing constraints applied by the OS scheduler, the
FPGA can continually be running and performing the computationally demanding
or timing specific tasks concurrently. It is able to send interrupts to the HPS and
the HPS can read or write to the memory available to the FPGA. A depiction of the
resources available and their relative locations within the Arria 10 SoC chip is shown
in Figure 2.3. The architecture of the Arria V SoC is similarly laid out, though with
lower-level technology in the transceivers and DSP blocks.
Figure 2.3: Graphical depiction of relative resources in the Arria 10 SoC chip [11]
SYSTEM DESIGN
Logistic Regression Algorithm
One of the simplest hardware-implemented classification algorithms is logistic
regression. The inputs are a vector each of classification coefficients, means, and
standard deviations, the input data, and matrices of a full bright image and a full
dark image. The equations describing the process are:
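\text{normalized} = (\text{data} - \text{dark}) / \text{white} \quad (3.1)
\text{corrected} = (\text{normalized} - \text{mean}) / \text{standard dev} \quad (3.2)
\text{product} = \text{coefficient} \cdot \text{corrected} + \text{previous} \quad (3.3)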
In these equations, previous represents the running sum. It starts with the zeroth
coefficient value and subsequent products are added on. An example is shown here
assuming input vectors of the corrected data and the coefficients. The coefficient
vector is one element longer than the input vector.
for i = 1:NUMBER_BINS
    if i == 1
        % Seed the running sum with the zeroth coefficient.
        product = coefficient(i) + corrected(i) * coefficient(i+1);
    else
        product = product + corrected(i) * coefficient(i+1);
    end
end
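Equations 3.1 through 3.3 and the loop above can be combined into a single software reference model. The Python sketch below is an assumption-laden illustration rather than the thesis implementation: the function name `inner_product`, the per-bin list layout, and the sample values are all hypothetical, and it uses ordinary division in place of whatever arithmetic the hardware employs.

```python
def inner_product(data, dark, white, mean, std_dev, coefficient):
    """Reference model of Equations 3.1-3.3 for one pixel and one class.

    All inputs except coefficient hold one value per spectral bin.
    coefficient has one extra entry: coefficient[0] seeds the running sum.
    """
    product = coefficient[0]
    for i in range(len(data)):
        normalized = (data[i] - dark[i]) / white[i]          # Eq. 3.1
        corrected = (normalized - mean[i]) / std_dev[i]      # Eq. 3.2
        product = coefficient[i + 1] * corrected + product   # Eq. 3.3
    return product

# Made-up values for a pixel with two spectral bins.
result = inner_product(
    data=[6.0, 10.0], dark=[2.0, 2.0], white=[8.0, 16.0],
    mean=[0.25, 0.25], std_dev=[0.25, 0.5],
    coefficient=[1.0, 2.0, -1.0],
)
print(result)  # 2.5
```

With these inputs the corrected values are 1.0 and 0.5, so the running sum is 1.0 + 2.0 * 1.0 - 1.0 * 0.5 = 2.5.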
The final step of the logistic regression calculation is computing the
probability associated with the inner product result. This probability calculation
is given in Equation 3.4, where X = product after the inner product is completed. For
this system, the classification does not depend on the absolute probability value, only
on the relative probability. Because the mapping between the inner product result and
the probability is one-to-one and order-preserving, it is sufficient to use the inner
product result as a representative
of the relative probability for each class to determine the class that each pixel most
likely belongs to.
\[ P = \frac{1}{1 + e^{-X}} \tag{3.4} \]
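The claim that the inner product can stand in for the probability can be checked numerically. The following is a minimal sketch, assuming the standard logistic function; the per-class inner product values are made up for illustration and are not results from this system.

```python
import math

def logistic(x):
    """Standard logistic function, P = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical inner product results for one pixel, one value per class.
inner_products = [-2.3, 0.7, 4.1, 1.9]

probabilities = [logistic(x) for x in inner_products]

# Because the logistic function is strictly increasing, the class with the
# largest inner product is also the class with the largest probability.
best_by_product = max(range(len(inner_products)), key=lambda k: inner_products[k])
best_by_probability = max(range(len(probabilities)), key=lambda k: probabilities[k])
```

Since both selections always agree, the exponential and the division can be left out of the hardware without changing the winning class.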
To ensure that the computations are hardware-friendly, only multiplications and
additions are implemented. This means that any value used as a divisor is
first inverted in software before being ported to the hardware. The computation
of logistic regression involves matrix inner products. Given normalized inputs and
classification coefficients, the inputs are multiplied by the corresponding coefficients
and a running sum is kept over a pixel to achieve a single result representing the
probability that the pixel belongs to that class.
Figure 3.1: Inner product calculation with vector of coefficients and matrix of
normalized values with dimensions of number of bins by number of pixels.
The normalization that takes place uses the white and dark values that are
passed with each data input as well as stored mean and standard deviation values that
represent the mean and standard deviation across the spectral bins from a training
set. The white and dark values are used to scale the data to the range 0 to 1,
while the mean and standard deviation standardize each spectral bin against the
training set. The dark value is subtracted from the data and the result is multiplied by
the inverse white value. Subsequent operations involve subtraction of the mean and
multiplication by the inverted standard deviation. The inverse of both the white
values and the standard deviations are calculated externally before being stored in
system memory so no hardware divides are required, enabling fewer resources used
and faster clocks.
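The multiply-and-add form of Equations 3.1 to 3.3 can be sketched in software as follows. This is an illustrative model only, not the VHDL implementation; the function names and all numeric values are hypothetical, and the reciprocals are computed by a helper that stands in for the external software step.

```python
def invert(values):
    """Software-side step: precompute reciprocals so the hardware never divides."""
    return [1.0 / v for v in values]

def classify_pixel(data, dark, inv_white, mean, inv_std, coefficients):
    """Inner product for one pixel and one class using only multiply and add.

    coefficients has one more entry than data: coefficients[0] is the
    zeroth coefficient that seeds the running sum (Equation 3.3).
    """
    product = coefficients[0]
    for i in range(len(data)):
        normalized = (data[i] - dark[i]) * inv_white[i]      # Equation 3.1
        corrected = (normalized - mean[i]) * inv_std[i]      # Equation 3.2
        product = product + coefficients[i + 1] * corrected  # Equation 3.3
    return product

# Hypothetical values for a pixel with three spectral bins.
data = [520.0, 610.0, 480.0]
dark = [20.0, 25.0, 30.0]
white = [1000.0, 900.0, 950.0]
mean = [0.4, 0.6, 0.5]
std = [0.1, 0.2, 0.25]
coefficients = [0.5, 1.0, -2.0, 0.75]

result = classify_pixel(data, dark, invert(white), mean, invert(std), coefficients)
```

Multiplying by a stored reciprocal gives the same result as dividing, up to floating-point rounding, which is why the inversion can be moved out of the hardware.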
Hardware Elements
Arria 10 SoC
The Arria 10 SoC by Altera (which was acquired by Intel in 2015) was chosen
for the primary computation engine on this project for several reasons. The primary
reason is its hardened floating-point units, which enable over 1.5 trillion
floating-point operations per second of performance [12]. This is the
first device on the market to incorporate single-precision floating-point multipliers
and adders into the hard DSP blocks [12]. This greatly simplifies system development,
since fixed-point algorithms take much more effort to develop and soft floating-point
multipliers consume large amounts of FPGA fabric. In addition,
the largest device in this family has 660K logic elements (LE), over 42Mb M20K
memory, and up to 48 transceivers capable of 17.4 Gbps [11]. This device is the
best mid-range FPGA on the market today. Future iterations of this system could
utilize different versions of the Arria 10 or move to a higher-end device in the Stratix
10 family (the largest of which will have around 30 billion transistors [13]). The Arria 10
is the best FPGA for this purpose right now because its performance surpasses the
speed requirements of the cameras for data alignment purposes while still remaining
affordable. The Stratix devices, while offering more logic elements and higher DSP
performance, excel mainly in transceiver performance and are best suited to tasks
involving high transmission rates. Though this design does use
transceivers, and may benefit in the future from moving to these more advanced
devices, it is not necessary to have the higher capability given the limit of the data
rate from the cameras.
Development Board Components
In addition to the Arria 10, other components on the development board that
were utilized for this project include the DDR3 DRAM, the SMA connectors, and
the FMC connector. The board provides 1 GB of DRAM each for the FPGA and the
HPS; this memory is used for storing the white and dark matrix values.
The SMA connectors are used as the interface for the transceivers to communicate
with the Arria V FPGA. A daughter card was developed to plug into the FMC for
the purposes of bringing camera data into the FPGA.
The Arria V SoC is also an important part of the project. It sits on its own
development board, which includes SMA connectors, an FMC connector, a Max V
CPLD, and a programmable oscillator. The SMA connectors here are again the
interface for the transceivers. The FMC is used for the custom daughter card to
connect the monochrome camera to the FPGA and the oscillator is utilized for
achieving the ideal clock frequency for the transceiver communications. The oscillator
is programmed over I2C by the Max V, which has to be programmed separately
prior to running the desired program on the FPGA. The high-level diagram of system
components is shown in Figure 3.2. Not shown is the external PC, which interacts
with the HPS over Ethernet. Although the Arria V will be replaced with an Arria 10
in future system implementations, it is currently used for the monochrome camera
subsystem because that subsystem was originally developed on it.
Additional Custom Boards
In addition to the two FPGA development boards, the project also required the
creation of several daughter cards, i.e. printed circuit boards (PCBs). Three cards
Figure 3.2: High level view of the components external to the SoC utilized in the
system. The colored regions depict the individual PCBs.
were developed using the PADS software from Mentor Graphics [14], by teammates
Connor Dack and Alex Matejunas. The first card is designated the "sensor board"
and contains all of the circuitry to connect to the lines of the CMOS sensor chosen
to be the face of the hyperspectral front end. All of the lines are routed out to two
100-pin ribbon cable connectors. This board is separate so that it can be oriented
at a 90° angle to the rest of the boards, and also so that it is modular and
could be easily swapped with a different sensor, should the need or desire arise. A
second board contains more ribbon cable connectors and connects the data from the
cables to the FMC, which will bring it into the FPGA for processing. This board
also contains circuitry for the transceivers, including SMA connectors and a clock
generator to provide a reference clock, since the Arria 10 development board does not
contain any SMA connectors for transceivers. Both boards are shown in Figure 3.3
connected to the Arria 10 pre-production development board.
Figure 3.3: The PCBs created for the hyperspectral camera connection to the Arria
10 FPGA. They are shown attached to the FMC, without the ribbon cables and
coaxial cables.
A board was also created to connect to the FMC on the Arria V with inputs
for the monochrome line scan camera’s Camera Link cables. A custom board was
created for this purpose because not all of the FMC pins are connected on the Arria
V development board, though they are needed for communication with the camera.
Consequently, a daughter card was developed to appropriately map the camera signals
to connected FMC pins on the FPGA.
Project Overview
In order to implement a processing system on the FPGA, the tasks required
were broken into system blocks as detailed in the following sections. These blocks are
the camera interface, the DRAM interface, the computation block, and the FPGA-
to-FPGA interface, which encompasses the communication system between the two
boards in order to send signals to the air jet system and transmit object information.
Figure 3.4: Block diagram of the full system functionality
Camera Interface
In order to integrate the camera sensor on this prototype system, two additional
boards were fabricated. The first board houses the sensor and has connectors for
the data to pass to the second board, which is connected to the development board
housing the FPGA, and routes all the data signals to the appropriate places to be
accessed from the FPGA as well as ensuring the control signals are appropriately
routed. This board also contains clock generator circuitry and SMA connectors
for transceiver communication purposes. As previously mentioned, the two boards
are connected via ribbon cable in the prototype to allow the sensor to be at a 90°
angle from the other boards. Configuring the sensor on its own board makes this a
modular product, in which other sensors (on their respective boards) could be used
to replace the current one so long as the signals are routed in the same way through
the connectors.
Figure 3.5: Block diagram of the camera interface subsystem.
The programmable hardware interface for the camera consists of a state machine
that compiles all the bits per pixel as they are presented and attaches location
information describing which pixel and spectral band the data belongs to. It
also pulls the data from DRAM through a FIFO interface and verifies the location
information matches that of the pixel that is being compiled. This interface also
sends any control signals to the camera required for triggering a start and providing
a clock signal to the camera.
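The alignment check between camera data and prefetched DRAM data can be modeled in software as below. This is a simplified sketch, not the state machine itself: the record layout, the burst length of 4, and the function names are assumptions made for illustration, and the buffer is refilled only when empty rather than ahead of time as real prefetching hardware would do.

```python
from collections import deque

def burst_read(dram, start, length):
    """Model one burst: return `length` consecutive records from `dram`."""
    return dram[start:start + length]

def align(pixels, dram, burst_length=4):
    """Pair each incoming pixel with its prefetched (dark, white) record.

    pixels: iterable of (location, value) in arrival order.
    dram:   list of (location, dark, white) stored in that same order.
    """
    fifo = deque()
    next_address = 0
    aligned = []
    for location, value in pixels:
        # Refill the buffer with a burst when it runs empty.
        if not fifo:
            fifo.extend(burst_read(dram, next_address, burst_length))
            next_address += burst_length
        stored_location, dark, white = fifo.popleft()
        # The interface verifies that the record belongs to this pixel and band.
        if stored_location != location:
            raise ValueError("location mismatch between camera and DRAM data")
        aligned.append((location, value, dark, white))
    return aligned

# Hypothetical data: eight locations stored in arrival order.
dram = [(i, 10 + i, 1000 + i) for i in range(8)]
pixels = [(i, 500 + i) for i in range(8)]
aligned = align(pixels, dram)
```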
The latency in the Camera Interface is determined based on the camera data
rate as well as the number of cycles needed to delay the data before it is passed on
in order for it to be parallelized. Since the data is presented in 10 taps, one bit at a
time, there is a delay needed to accumulate all the relevant bits per pixel per band.
Additionally, because the data is initially presented serially, it must be delayed
in order to create 5 parallel channels. This delay was accepted because, although it
slows down the initial presentation of the data to the computation unit, the increase
in computations completed through the parallel channels is great enough to overcome
the initial serial latency.
DRAM Interface
In order to account for the effects of the camera, all incoming data is normalized
by the white and dark values as in Equation 3.1. These are meant to correspond to
the largest and smallest possible data values that have been, or could be, seen. This
data is captured in still images taken prior to operation of the system. The dark
image is taken while the lens cap is on the camera to provide a theoretical darkest
environment possible. In contrast, the white image is captured while the camera is
fixed on a white strip illuminated to the brightest level seen under the operational
lighting conditions. Given that these are full matrices, with potentially variant values
at each pixel/band, all the information captured needs to be stored. Due to its size, it
was decided to store this data in off-chip Dynamic Random Access Memory (DRAM)
so that there is still plenty of room for more frequently accessed and changing data
in the on-chip SRAM. This was also deemed an acceptable choice of storage
location because the values are only accessed before the computations and on-chip
memory is used to buffer the values as they are accessed, so the delay between
address specification and data return is less critical (i.e. the DRAM matrix values
are pre-fetched, alleviating any effects of DRAM refresh stalls).
DRAM consists of a grid of capacitors and transistors where each capacitor is
capable of storing a single bit based on its voltage level. The transistor is used to
access that particular capacitor and charge or drain it as necessary. The memory has
to be refreshed occasionally to keep the stored values as the capacitors drain their
charge over time. Since each stored bit only requires a single transistor and capacitor,
this memory is very dense and cheap, making it attractive in industrial applications.
However, DRAM is not quick to access in comparison to SRAM that is located in
the FPGA fabric. The timing of controller interactions with the memory is also
Figure 3.6: Block diagram of the memory subsystem.
very technically challenging, but is handled through a hardened memory controller in
the FPGA that provides the direct interface functionality. A custom controller was
developed to be the master of this interface and this controller issues read or write
commands.
As camera data enters the system, prior to being processed in the computation
subsystem, it needs to be properly aligned with the corresponding location’s white
and dark matrix values. In order to accomplish this, the matrix values are pulled
from memory sequentially in the same order that camera data is received. The matrix
is stored so that this order corresponds to sequential memory addresses. This allows
burst access, in which multiple sequential address locations are read in a single
transaction, to take advantage of the structure of DRAM. A pre-buffering
system is in place to hold the burst read data and enable alignment with the data
that is coming in faster than the DRAM can be accessed, if each location were to be
read individually. The white and dark values for each location are stored together,
requiring only one memory access per location. This was done to facilitate ease of
access and use as both values need to be aligned with the incoming data. Additionally,
the bus between DRAM and the FPGA can sustain signals of the width needed to
accommodate each of the data values and the location (see Table 3.1). This buffer’s
output is made available to the camera interface to enable the data alignment with
incoming values.
Table 3.1: Information Associated With a Pixel in DRAM
Bits 127-96     Bits 95-64    Bits 63-32    Bits 31-0
ZERO PADDING    LOCATION      DARK          WHITE
(32 bits)       (32 bits)     (32 bits)     (32 bits)
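The word layout in Table 3.1 can be illustrated with a short packing routine. This sketch assumes that the dark and white fields hold single-precision floating-point values and that the fields are ordered from most to least significant bit as listed in the table; the function names are hypothetical.

```python
import struct

def float_bits(x):
    """Return the 32-bit IEEE-754 single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_float(bits):
    """Inverse of float_bits: interpret a 32-bit integer as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def pack_word(location, dark, white):
    """Build the 128-bit word: [127:96] zero, [95:64] location, [63:32] dark, [31:0] white."""
    return (location << 64) | (float_bits(dark) << 32) | float_bits(white)

def unpack_word(word):
    """Split a 128-bit word back into (location, dark, white)."""
    location = (word >> 64) & 0xFFFFFFFF
    dark = bits_float((word >> 32) & 0xFFFFFFFF)
    white = bits_float(word & 0xFFFFFFFF)
    return location, dark, white

# Hypothetical entry; 0.5 and 0.25 are exactly representable in single precision.
word = pack_word(0x1234, 0.5, 0.25)
```

Because both values for a location share one word, a single read returns everything needed to normalize that pixel and band.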
Computation Unit
The pixel and object classifications are done in the computation unit. This block
consists of the normalization step as well as the inner product engine to complete the
classification. It also compiles a full object classification, sorts the results, and makes
a decision at the end. The system uses logistic regression to classify the pixels,
as was introduced in Section 3, Logistic Regression Algorithm. Presented below is a
detailed explanation of the block that classifies the pixels, then subsequently the objects.
Pixel Classification In order to perform the logistic regression calculation on the
incoming pixels, there are a number of parallel blocks implemented. The first is the
normalization, which performs the calculations in Equations 3.1 and 3.2 on the
incoming data in parallel. At this step there is a DSP block per calculation step per parallel
data channel. The mean and standard deviation values are stored in on-chip RAM for
easy access. The output of the normalization is passed to the inner product blocks.
Figure 3.7: Block diagram of the computation subsystem.
There is an inner product block for each class within each parallel channel. This
corresponds to NUMBER_OF_CLASSES × NUMBER_OF_PARALLEL_CHANNELS
DSP blocks, as each inner product requires only one DSP unit. For this prototype
design, that means there are 16 × 5 = 80 parallel DSP blocks used for the inner
product. The class coefficients are located in memory blocks for each class, addressed
by band number. At the end of each pixel, the results of the inner product blocks for
each parallel channel are added together to have one result per class. The output,
then, is a vector of probabilities designating how likely it is that the pixel belongs to
each class. This information is stored in on-chip memory for access by the user, as
well as being passed to the object classification subsystem.
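The per-channel accumulation and the final channel sum can be sketched as follows. The round-robin assignment of bands to channels is an assumption made for illustration, as are the function names and input values; the point is only that summing the channel partial sums gives the same result as one serial inner product.

```python
NUMBER_OF_CLASSES = 16
NUMBER_OF_PARALLEL_CHANNELS = 5

# One inner product DSP block per class per channel.
dsp_blocks = NUMBER_OF_CLASSES * NUMBER_OF_PARALLEL_CHANNELS

def channel_partial_sums(corrected, coefficients, channels):
    """Each channel accumulates the bands it is handed (assumed round-robin)."""
    partial = [0.0] * channels
    for band, value in enumerate(corrected):
        partial[band % channels] += coefficients[band + 1] * value
    return partial

def class_result(corrected, coefficients, channels=NUMBER_OF_PARALLEL_CHANNELS):
    """Add the channel results at the end of the pixel, seeded with coefficients[0]."""
    return coefficients[0] + sum(channel_partial_sums(corrected, coefficients, channels))

# Hypothetical pixel with ten bands and a single class.
corrected = [float(band) for band in range(10)]
coefficients = [1.0] + [2.0] * 10
result = class_result(corrected, coefficients)
```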
The computational complexity of this classification is found by analyzing the
number of operations that could be happening at any one time. Once the system is
fully in operation, all of the normalize DSP blocks and inner product blocks can be
running at once. Assuming this is the case, the performance of the pixel classification
when running on the 70 MHz data clock with a 210 MHz operation clock is 4.55
GFLOPS. Concurrently, the on-chip memory bandwidth can be analyzed for each
of the instantiated memory blocks. Taking into consideration the blocks for the
means, standard deviations, classes, intercepts, and results, the total on-chip memory
bandwidth is 44.8 GBytes/s. Much of the data pulled from these memory blocks is
actually not used, but is required to fulfill acceptable memory port width ratios. This
number was found using the 70 MHz clock rate that is used to read or write to the
memory on the processing side of the system.
Object Classification In classifying the full object, data is utilized from the
monochrome line scan camera as well as the pixel classifications obtained using
the hyperspectral data. The line scan camera is taking images at 80,000 frames
(lines) per second (fps) and the Arria V FPGA is performing calculations to find the
location of an object. The line scan line number and pixel number are translated into
the corresponding hyperspectral numbers on the Arria V to prevent transmission
of numerous repetitive entries. This translation is done because the monochrome
pixels are at a much finer resolution than the hyperspectral pixels. Information
about the object’s location is transmitted to the computation block including the
line number, an object number, and the beginning and ending pixel in that line that
defines the object. A record is kept in the computation unit on the Arria 10 of object
locations, which is used to accumulate pixel classifications within each object. After
an object disappears from the scan line, the overall results are used in a lookup table
to determine if the highest level classification probability is good or bad, though
future systems could look at the accumulated class probabilities of all the classes to
make a decision. The decision to eject the object is based on this lookup and
sent to the Arria V, which also controls the air jets. The final sorted results
are made available to the HPS through a streaming process which feeds a modular
Scatter Gather Direct Memory Access (mSGDMA) block that will write streaming
data to SRAM belonging to the HPS.
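The accumulation of pixel classifications into objects and the lookup-based decision can be sketched as follows. The data structures, the three-class example, and the lookup contents are all hypothetical; the sketch only illustrates the flow of object records, per-object running totals, and a decision on the top class.

```python
def accumulate(objects, line_records, pixel_results):
    """Add the per-class results of each covered pixel to its object's totals.

    objects:       dict mapping object number to a list of per-class totals
    line_records:  list of (object_number, first_pixel, last_pixel) for one line
    pixel_results: list of per-class result vectors, one per pixel on the line
    """
    for object_number, first_pixel, last_pixel in line_records:
        totals = objects.setdefault(object_number, [0.0] * len(pixel_results[0]))
        for pixel in range(first_pixel, last_pixel + 1):
            for k, score in enumerate(pixel_results[pixel]):
                totals[k] += score

def decide(totals, eject_lookup):
    """Return True to eject, based on the class with the highest accumulated total."""
    top_class = max(range(len(totals)), key=lambda k: totals[k])
    return eject_lookup[top_class]

# Hypothetical line of six pixels with three classes; class 1 is the "bad" class.
line = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 1.0, 1.0],
        [0.0, 3.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
eject_lookup = [False, True, False]

objects = {}
accumulate(objects, [(7, 1, 3)], line)   # object 7 spans pixels 1 to 3 on one line
accumulate(objects, [(7, 2, 4)], line)   # and pixels 2 to 4 on the next line
eject = decide(objects[7], eject_lookup)
```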
Since not every pixel will need to be accumulated into an object, and pixel
results do not show up every clock edge, there is only one DSP block implemented in
this section. If it were constantly in operation, it would achieve a performance of 70
MFLOPS. The memory block that holds the accumulated results for each object has
a theoretical maximum bandwidth of 4.48 GBytes/s. This is theoretical because, as
with the adder, there will not be a memory access on every clock period due to the
intermittent nature of the data that will be accumulated.
The VHDL code for operation of the computation subsystem can be found in
Appendix B. These files are
• computation_unit.vhd
    – regression.vhd
        ∗ normalize.vhd
        ∗ channel_sum.vhd
    – sort.vhd
    – object_tracking.vhd
FPGA Interface
In order to communicate between the two camera subsystems, a high-speed (6
Gbps) serial interface was designed to connect the Arria V and Arria 10 FPGAs. The
monochrome line scan camera is connected to, and its data is processed on, the Arria
V while the hyperspectral camera is connected to the Arria 10 where the inner product
computations for classification occur. In order to make a full object classification, the
information about each object’s location needs to be passed from the Arria V to the
Arria 10 and the ultimate decision to keep or discard each object is sent back from
the Arria 10 to the Arria V, which also controls the air jet system. One reason for
Figure 3.8: The connection between the Arria V development board (top) and the
Arria 10 development board (bottom).
the two separate boards is the availability of FPGA Mezzanine Connectors (FMC)
on the development boards. The monochrome line scan camera requires two Camera
Link cables between the Arria V and the camera. The FMCs are connected to the
FPGA in such a way that both connectors are required to connect all the desired
signals for the camera. There are only two FMCs on the Arria V development board,
and consequently, both are used for this camera. The hyperspectral camera also uses
an FMC to connect to the FPGA. Since the Arria 10 is the larger device and includes
hardened floating point, it is imperative that this board is used for connecting to the
hyperspectral camera. Since a daughter card has been developed such that all the
correct signals are routed to one FMC for the monochrome camera, both cameras
could be connected to the Arria V but not the Arria 10 as the available board only
has one FMC. The Arria 10 board currently contains an engineering sample (pre-
production) version of the Arria 10 since we have not been able to get a production
evaluation board of the Arria 10. The production evaluation board will have two
FMCs and at that time, the full design can be ported to one board, provided the
logic will fit on the device, eliminating the need for the communication interface.
It is not yet known whether the Arria 10 has enough resources to fit the full design
or whether two Arria 10 FPGAs will be needed.
The communication interface between the two FPGA boards is through SMA
connectors connected to coax cables that use the high-speed transceivers in each of
the FPGAs. Using the SerialLite2 protocol, the transceivers can establish a link and
transmit data. SerialLite is a communication protocol that is particularly good for
high-speed serial communication and has less overhead than other serial protocols.
The protocol includes CRC checking as well as optional scrambling/descrambling
of the data. It can also be used with multiple receivers and a broadcast mode,
if desired. Though the SerialLite2 IP core provided by Altera is not yet readily
available for the Arria 10, it is indirectly supported. Since SerialLite3, which is
available for the Arria 10, is not compatible with SerialLite2 due to different encoding
schemes and packet structure differences, the SerialLite2 core was needed to be able
to communicate between the two boards. In addition to the SerialLite2 core, the
native transceiver PHY cores are used for their respective devices. These implement
the PMA (physical medium attachment) aspect of the transceivers as well as some of
the PCS (physical coding sublayer) and handle the physical transmission of the data.
The SerialLite2 core then sits on top of the native core and handles additional PCS
tasks of transmission, such as providing a CRC for the data.
In order to set up the reference clock, which is required to be 156.25 MHz to
match the reference clock frequencies generally accepted by the transceivers, a
programmable clock was needed. At first, this was done using the programmable oscillator on the
Arria V development board, which provides the reference clock to the transceivers
that are connected to the SMA connectors on the board, and is programmable over
I2C from the Max V CPLD controller. After the FMC breakout board was completed,
in order to provide additional SMA connectors for testing of the transceiver channel,
the clock generator on the breakout board had to be programmed over I2C from the
FPGA to generate the desired frequency clock. The clock generator chosen for this
purpose has four one-time programmable (OTP) configurations, so that the correct
frequency can be loaded on power-up. After programming the volatile RAM with
the desired values, they were burned into a configuration on the OTP memory of the
generator and subsequent projects need only enable the output of the clock generator
to get the desired frequency. This made it much easier to ensure the right frequency
was available at transmission time, rather than programming the clock each time the
power was cycled.
On the Arria 10 pre-production development board, the FMC breakout board
can be used; however, the reference clocks connected to the clock generator outputs
do not connect to the reference clock inputs on the same bank as the populated
transceivers. One of the reference clocks on this bank is provided by a programmable
oscillator on the development board that has approximately 10 clock outputs. Instead
of programming yet another oscillator, the SMA clock outputs were found to provide
a 156.25 MHz clock that can be transmitted over SMA to one of the receivers to be
used as a reference clock on the breakout board.
Performance
A significant benefit of this system is the increase in performance from the
previous method of processing. Previously, Resonon has been using a camera with a
frame rate of 140 fps with a spatial resolution of 640 pixels and a spectral resolution
of 240 bands. The system under development uses a 500 fps camera sensor
(full resolution) running at 2000 fps (partial resolution) with a spatial resolution of
1024 pixels (reduced to 256) and a spectral resolution of 160 bands. A large increase
in spectral bands was neither needed nor desired by Resonon because they have found
that the data becomes redundant and less useful after a certain point. With a clock
speed of 70 MHz on the computation side and approximately 157 cycles to classify
a pixel, this means that it takes only 1.57µs to compute the classification for a full
pixel. With 1024 pixels, this hyperspectral computation takes 1.6ms per frame (i.e.
line scan).
The monochrome line scan camera can run up to 80,000 lines per second,
and the transmission rate between the two boards is at 6250 Mbps. With 54 bits
needed to represent the information per object per line, objects are transmitted
at a rate of 115.74 MHz. When packaged in 32-bit data words and including
start and end packets, the transmission is still accomplished in 20.48ns. Since the
monochrome line scan calculations can run on an 83.5 MHz clock, it is able to keep the
transmission buffer full (i.e. calculations take place faster than they are needed), but
not overflowing since there will not be objects found on every clock edge. The decision
on the hyperspectral side is made using the 70 MHz clock that the computation unit
runs on, so there may be some dead spots in the return transmission. Upon receipt of
the object information, the line and pixel numbers are stored using the object number
in an array updated with each transmission to note where objects are on the line.
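The transmission figures above follow from simple arithmetic on the link rate. The sketch below reproduces them; the packet structure of two 32-bit data words plus one start word and one end word is an assumption inferred from the stated 54 bits and 32-bit word size, and any line-coding overhead is ignored.

```python
LINK_RATE_BPS = 6250e6      # transceiver line rate stated in the text
BITS_PER_OBJECT = 54        # object information per object per line
WORD_BITS = 32

# Raw object rate if the 54 bits were sent back to back.
object_rate_mhz = LINK_RATE_BPS / BITS_PER_OBJECT / 1e6

# Assumed packet: enough 32-bit data words to hold 54 bits, plus start and end words.
data_words = -(-BITS_PER_OBJECT // WORD_BITS)   # ceiling division
packet_bits = (data_words + 2) * WORD_BITS
packet_time_ns = packet_bits / LINK_RATE_BPS * 1e9
```

Under these assumptions the packet is 128 bits long, which at 6250 Mbps gives the 20.48 ns transmission time quoted above.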
IMPLEMENTATION DETAILS
Programmable Oscillator
In order to achieve an accepted clock frequency for the transceiver reference clock
on the Arria V board, a clock generator had to be programmed. The first iteration
involved programming the programmable oscillator, a Si570 device from Silicon Labs,
provided on the development board. In order to achieve the desired frequency of
156.25 MHz, 6 of the available registers are required to be programmed via the I2C
lines which are connected to the MAX V CPLD system controller that is also on the
board. The oscillator does not have persistent memory, so it must be reprogrammed
after every power loss to consistently have the desired frequency on every run of the
device. This can be arranged by programming the Max V to run the I2C code as
part of the device configuration. Programming the oscillator requires knowledge of
the current frequency and register values, as the calculations for new values are based
on the current configuration. These default values and the new values were obtained
using the Clock Controller GUI provided as part of the board test system from Altera.
Registers
The device has two sets of identical registers, one set for devices with 20 or
50 ppm temperature stability, and the other for devices with 7 ppm temperature
stability. The oscillator provided has 7 ppm temperature stability, 20 ppm total
stability as determined by the part number. The critical values needed to program the
registers are the output divider values (N1 and HS_DIV) and the crystal frequency
multiplication ratio (RFREQ). The output dividers are found by changing the
existing values as little as possible, but keeping the digitally controlled oscillator
(DCO) frequency within the acceptable range of operation. The factory default is
a 100 MHz clock with divider values and DCO frequency as shown in Figure 4.1.
Using the GUI, the necessary values to program were easily obtained as shown in
Figure 4.2. Though provided by this tool, they could also be found using a couple
of equations, which were utilized in the MATLAB script created to print the VHDL
constants for the programming of the registers. Based on the required values, all the
registers needed to be programmed with values as shown in Table 4.1. The steps to
derive these values are in Equations 4.1, 4.2, and 4.3 [15].
The RFREQ value is a 38-bit fixed-point number with 28 fractional bits, so the
register value is divided by 2^28 to obtain the actual multiplication ratio before
performing the calculations, and the result is multiplied by 2^28 at the end to convert
it back to the register format. The values for HS_DIV and N1 are chosen from a
selection of allowed values with the goal of minimizing the DCO frequency (f_dco)
within an acceptable range, and also achieving the lowest possible N1 and the highest
HS_DIV. Here f_0 is the current output frequency and f_1 is the desired output frequency.
\[ f_{xtal} = \frac{f_0 \cdot HS\_DIV \cdot N1}{RFREQ} \tag{4.1} \]
\[ f_{dco} = f_1 \cdot HS\_DIV \cdot N1 \tag{4.2} \]
\[ RFREQ = \frac{f_{dco}}{f_{xtal}} \cdot 2^{28} \tag{4.3} \]
Table 4.1: Register settings for Si570
Register Number Old Value (Hex) New Value (Hex)
13 22 A0
14 42 C3
15 BC 13
16 30 B7
17 EE 0C
18 FA D9
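The register values in Table 4.1 can be reproduced from Equations 4.1 to 4.3. The following Python sketch is an illustration and is not the MATLAB script used in this work. It assumes the register bit layout described in the Si570 datasheet [15]: HS_DIV in the top three bits of register 13 (stored as the value minus 4), N1 as a 7-bit field split across registers 13 and 14 (stored as the value minus 1), and RFREQ in the remaining 38 bits. The function names are hypothetical, and the new divider values of 9 and 4 are those implied by the new register values in the table.

```python
# Register values for Si570 registers 13 to 18, taken from Table 4.1.
OLD_REGS = [0x22, 0x42, 0xBC, 0x30, 0xEE, 0xFA]
NEW_REGS = [0xA0, 0xC3, 0x13, 0xB7, 0x0C, 0xD9]

def decode(regs):
    """Unpack six register bytes into (HS_DIV, N1, RFREQ as a 38-bit integer)."""
    hs_div = (regs[0] >> 5) + 4
    n1 = (((regs[0] & 0x1F) << 2) | (regs[1] >> 6)) + 1
    rfreq = regs[1] & 0x3F
    for byte in regs[2:]:
        rfreq = (rfreq << 8) | byte
    return hs_div, n1, rfreq

def encode(hs_div, n1, rfreq):
    """Pack (HS_DIV, N1, RFREQ as a 38-bit integer) into six register bytes."""
    n1_field = n1 - 1
    regs = [((hs_div - 4) << 5) | (n1_field >> 2),
            ((n1_field & 0x3) << 6) | (rfreq >> 32)]
    regs += [(rfreq >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return regs

f0 = 100.0     # factory default output frequency, MHz
f1 = 156.25    # desired output frequency, MHz

hs_old, n1_old, rfreq_old = decode(OLD_REGS)
fxtal = f0 * hs_old * n1_old / (rfreq_old / 2**28)   # Equation 4.1

hs_new, n1_new = 9, 4                                # dividers implied by Table 4.1
fdco = f1 * hs_new * n1_new                          # Equation 4.2
rfreq_new = round(fdco / fxtal * 2**28)              # Equation 4.3

new_regs = encode(hs_new, n1_new, rfreq_new)
```

Under these assumptions the computed bytes match the "New Value" column of Table 4.1, with a crystal frequency of roughly 114.25 MHz and a DCO frequency of 5625 MHz.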
Figure 4.1: Factory Default Clock Register Settings for Si570
Figure 4.2: Preferred Clock Register Settings for Si570
In order to perform the programming of the device, an I2C master component
was utilized, provided by Scott Larson on EE Wiki [16]. A state machine was
devised to progress through each of the registers and start individual transactions
with the master driver. Following each write, a stop is sent rather than continuing
the write, so that each register address is given explicitly. Since
all registers are written sequentially, this is not strictly necessary
and all registers could have been written in a single streaming write sequence, but using
individual transactions ensures that a specific register receives the data designated
for it. This also set up the state machine in a useful manner for the clock generator,
which does not require programming of all registers. The code for the implemented
driver can be found in Appendix B under i2c_driver.vhd.
Programmable Clock Generator
In order to further test the transceiver communication, two sets of transceivers
were required. Since the development board for the Arria V only contains one
set of SMA connectors and the Arria 10 engineering sample development board
does not contain any, a daughter card was fabricated to utilize the transceivers
through the FMC connector, with SMA connections. In order to achieve a viable
reference clock on the transceivers utilized by the daughter card, a clock generator
was included on the card along with the necessary circuitry. The VersaClock 6 Low
Power Programmable Clock Generator from Integrated Device Technology was chosen
because it is programmable over I2C, it has two configurable clock outputs, and it
has the option for four one-time programmable configurations stored in non-volatile
memory. The one-time programmable configurations are appealing in this project
because it does not require any setup once the configuration has been programmed;
the required frequency will be available on power-up of the device, unlike with the
oscillator on the development board.
Design Decisions
Much of the additional circuitry required by the clock generator is specified in the
datasheet, with recommendations such as using a 25 MHz crystal and terminations
for different output configurations [17]. One of the design decisions made concerns
the connections of the I2C lines and the select line, pins 8, 9, and 24 respectively, as
shown in Figure 4.3. Pin 24, OUT0_SEL_I2CB, is used to determine whether pins 8
and 9 will be select lines for one of the four stored configurations or the clock and
data lines for I2C communication. If connected to a pull-up resistor, they will be
select lines, otherwise, they will be used for I2C. Consequently, a pad was placed on
the PCB to enable a pull-up to be used, but it was not populated so the device could
be programmed over I2C. After power-up, this pin also serves as a clock output,
acting as a buffer for the selected reference clock [17]. Each of the clock outputs is
connected to a reference clock pin for the transceivers through the FMC connector
and one of them is also connected to the global clock network for use in FPGA logic,
if desired.
Figure 4.3: Diagram of Pin Assignments for VersaClock 6 Programmable Clock
Generator [17]
Registers
The VersaClock Clock Generator has registers programmable for four output
clocks, despite the fact that there are only two output clocks available on the device,
in addition to the reference clock output. The registers available to be programmed
include settings for the internal PLL divider and output dividers, both integer and
fractional. There is also the option to choose between the crystal reference and a
reference clock provided by the FPGA. The pins are shown in Figure 4.3. The registers
chosen to program include those for the programmable capacitors, the internal PLL
frequency dividers, and the output dividers. The values for the programmable tuning
capacitors were chosen based on Equation 4.4 [17] with an estimated combined stray
and external capacitance of 2 pF. Several values were tested to verify the values, but
there was not a large noticeable difference between any of the results, as seen on an
oscilloscope, so the originally designated values were kept. In choosing the values for
the PLL frequency dividers and the output frequency dividers, a voltage controlled
oscillator (VCO) frequency of 1250 MHz was targeted, which is the lower bound
of the desired range for the oscillator. Using this value with the required
output frequency of 156.25 MHz meant that no fractional divider values were needed for the
PLL or the output, which means fewer registers to program and a more
accurate clock division. A MATLAB script was used to print out the desired register
configurations and the resulting VHDL code is included in Appendix A.
$$C_L = \frac{9\,\mathrm{pF} + 0.5\,\mathrm{pF} \times \mathrm{XTAL}[5{:}0] + C_s + C_e}{2} \tag{4.4}$$
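As an illustration of the two calculations described above, the following Python sketch solves Equation 4.4 for the XTAL[5:0] code and checks whether the PLL feedback and output dividers come out as integers. The 8 pF target load capacitance and the even split of the 2 pF stray and external estimate are assumed values chosen for illustration, the function names are hypothetical, and the register encoding of the dividers is not modeled.

```python
from fractions import Fraction

def xtal_code(target_cl_pf, cs_pf, ce_pf):
    """Solve Equation 4.4 for the 6-bit XTAL code closest to the target load."""
    raw = (2.0 * target_cl_pf - 9.0 - cs_pf - ce_pf) / 0.5
    code = max(0, min(63, round(raw)))  # clamp to the 6-bit field
    achieved_pf = (9.0 + 0.5 * code + cs_pf + ce_pf) / 2.0
    return code, achieved_pf

def integer_dividers(ref_mhz, vco_mhz, out_mhz):
    """Return (feedback divider, output divider, True if both are integers)."""
    feedback = Fraction(str(vco_mhz)) / Fraction(str(ref_mhz))
    output = Fraction(str(vco_mhz)) / Fraction(str(out_mhz))
    both_integer = feedback.denominator == 1 and output.denominator == 1
    return feedback, output, both_integer

# Assumed 8 pF crystal load; 2 pF combined stray and external capacitance.
print(xtal_code(8.0, 1.0, 1.0))
# 25 MHz crystal, 1250 MHz VCO target, 156.25 MHz output.
print(integer_dividers(25, 1250, 156.25))
```

With a 25 MHz reference, the 1250 MHz target divides evenly on both sides, which is consistent with the observation that no fractional divider registers were needed.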
Burning a Configuration
Unlike with the programmable oscillator on the development board, the clock
generator has the ability to hold four non-volatile configurations. The benefit of
using a non-volatile configuration is that the clock output is available very soon after
power-up, without having to re-program the generator each time. In order to burn a
configuration, all the registers in RAM were set to the desired values, the VCO was
calibrated, and then the registers designated for control of the OTP were programmed
to define the registers to burn and then checked to confirm that the burn completed
successfully. When bit 7 in the OTP Control register is set, the part automatically
loads data from OTP on power-up.
Utilizing the Clock Generator
With the required configuration burned into the part and automatically loaded
on power-up, all that remains to make the clock usable by the
transceivers is to enable the output and select the appropriate reference clock. For
the burned configuration, the default reference is the crystal input at 25 MHz.
The enable and select signals are both driven low on pins 6 and 7.
Altera IP
Within the Quartus software, Altera provides many different IP blocks as
"Megafunctions" that can be customized and dropped into a design. These make
handling transmission interfaces or creating memory blocks much simpler. However,
in our efforts to make the system as modular as possible, some of these had to be
bypassed and implemented by hand. Fortunately, the compiler will synthesize the
components and create the desired blocks even when they are not created as a megafunction.
One thing that requires care when writing a block by hand, however, is the
set of rules governing the block. For instance, a dual-port memory is very tricky to implement by
hand, as it cannot have arbitrary values on either side of the block. A benefit of
creating the unit within the MegaWizard is that the tools will inform the user of valid
values for each of the parameters. Without this interface, users must carefully choose
their values or learn of a failure when the design is compiled. This was encountered in
the memory block instantiations used within the computation subsystem.
In order to avoid creating multiple different memory components and also in
an effort to create a modular design, a memory block component was created that
instantiates the Altera altsyncram megafunction with generic parameters that can be
input at the time of instantiation. This is a perfectly reasonable approach until a
port width ratio is violated. Rule violations happened several times over the course
of development and fixing them resulted in the creation of extra locations within the
memory block that were skipped on one port and ignored on the other, but required
to be there to enable the port ratios to work within the block. This is an unfortunate
waste of memory but not a huge concern for the design as it stands currently.
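The padding described above can be estimated before compilation. The following Python sketch is a minimal illustration, assuming that the two port widths must differ by one of the listed power-of-two ratios and that both ports must address the same total number of bits. The list of allowed ratios and the function name are assumptions and should be checked against the device documentation.

```python
VALID_RATIOS = (1, 2, 4, 8, 16, 32)  # assumed set of legal port width ratios

def padded_depths(width_a, width_b, words_a):
    """Return (depth_a, depth_b, pad_words_a) for a mixed-width memory.

    words_a is the number of port-A words actually needed. The depth is
    rounded up so the memory holds a whole number of words on the wider port.
    """
    wide, narrow = max(width_a, width_b), min(width_a, width_b)
    if wide % narrow != 0 or wide // narrow not in VALID_RATIOS:
        raise ValueError(f"unsupported port width ratio {wide}:{narrow}")
    used_bits = words_a * width_a
    total_bits = -(-used_bits // wide) * wide  # round up to whole wide words
    pad_words_a = (total_bits - used_bits) // width_a
    return total_bits // width_a, total_bits // width_b, pad_words_a

# Ten 32-bit words written on one port, read as 128-bit words on the other.
print(padded_depths(32, 128, 10))
```

In this example two extra 32-bit locations are created so that the 128-bit port sees three complete words, which mirrors the skipped and ignored locations described above.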
Timing Constraints
The hyperspectral camera is able to produce data at 6.6 Gbps at 500 fps when
using the full 1280x1024 pixel image. With the image size reduced to 256 pixels for
this application, the frame rate is up to 2014 fps. After the parallel
data streams are combined, the data is passed into the computation unit at 66 MHz. In order to
stay ahead of the incoming data, the computation unit needs a base clock at least
this fast, and preferably faster. Fortunately, faster is possible. The base clock in
the regression unit is targeted at 70 MHz. One complication of running faster than
the data is produced is ensuring that the blank times do not affect the overall results,
since the unit is constantly adding in new values over each pixel. Therefore, a signal
was added to classify each incoming data chunk as valid or not. Using this signal,
the computation unit determines whether or not the value should be added into the
existing calculations. The valid signal is not, however, simply the complement of the error
signal passed by the camera block. Error is set when the incoming data is bad or the
location of the white/dark matrix values does not line up with the location of the
incoming data. In this case, the pixel currently being calculated is zeroed out and no
incoming data is considered until the start of the next pixel.
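The interaction of the valid and error signals can be sketched behaviorally in Python. This is a simplified model and not the VHDL implementation: the start-of-pixel marker, the Sample type, and the function name are assumptions introduced for illustration.

```python
from typing import NamedTuple

class Sample(NamedTuple):
    value: float
    valid: bool = True    # False during blank times
    error: bool = False   # bad data or location mismatch
    start: bool = False   # assumed marker for the first sample of a pixel

def accumulate(samples):
    """Sum the valid samples of each pixel; an error zeroes the pixel and
    causes all further data to be ignored until the next pixel starts."""
    results, acc = [], 0.0
    active = skipping = False
    for s in samples:
        if s.start:
            if active:
                results.append(acc)
            acc, active, skipping = 0.0, True, False
        if not active:
            continue
        if s.error:
            acc, skipping = 0.0, True
        if s.valid and not skipping:
            acc += s.value
    if active:
        results.append(acc)
    return results

stream = [
    Sample(1, start=True), Sample(99, valid=False), Sample(2),
    Sample(3, start=True), Sample(4, error=True), Sample(5),
    Sample(6, start=True), Sample(7),
]
print(accumulate(stream))
```

The first pixel skips the blank sample, the second is zeroed by the error, and the third accumulates normally once the next pixel begins.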
The design uses two primary clocks for the computations and classification, one
with a frequency three times faster than the other. The slower is required to keep up
with the incoming data rate. The triple speed clock was included when it was found
that the floating point adder and multiplier each take three cycles to complete. In the
original design of the inner product unit, a multiplier was pipelined with an adder, but
the result from the adder was needed as an input to the multiplier for the following
calculation. This was a carryover from a previous implementation which received data
one spectral bin at a time, rather than one pixel at a time. Once the design was
changed to accommodate data arriving for a full pixel before moving on, the inner
product unit could also be changed. Quartus provides a multiply-accumulate floating
point megafunction that completes in four cycles. Using this, the faster clock was no
longer needed in this unit and data alignment was much easier. The faster clock was
kept, though, and utilized in the normalization step and combination of the parallel
data for the benefit of speed.
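The motivation for the triple-speed clock can be seen with a simple throughput estimate. The sketch below assumes a simplified model in which a three-cycle operator sits in a feedback loop, so that a new input can only be accepted once the previous result is available. The function name is hypothetical, and the 210 MHz figure is simply three times the 70 MHz base clock.

```python
def loop_throughput_mhz(clock_mhz, latency_cycles):
    """Sustained input rate when each result must finish before the next input."""
    return clock_mhz / latency_cycles

INPUT_RATE_MHZ = 66   # rate at which data enters the computation unit
BASE_CLOCK_MHZ = 70   # targeted base clock of the regression unit

for clock in (BASE_CLOCK_MHZ, 3 * BASE_CLOCK_MHZ):
    rate = loop_throughput_mhz(clock, 3)
    status = "keeps up" if rate >= INPUT_RATE_MHZ else "falls behind"
    print(f"{clock} MHz clock, 3-cycle loop: {rate:.1f} MHz ({status})")
```

Under this model, a three-cycle feedback loop on the base clock cannot keep pace with the input, while the same loop on the triple-speed clock sustains one result per base-clock period.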
A challenge encountered in the timing requirements of the computation unit was
achieving the correct setup and hold timing for each of the clocks and a maximum
frequency of the clocks that is at least the desired run frequency. The Quartus
software contains a timing analyzer known as TimeQuest, which will check paths and
analyze timing requirements as well as providing statistics on each of the paths. It
will also provide some recommendations to help close timing, when possible. With
the first inner product unit design, TimeQuest reported a maximum frequency for the faster
clock that was 100 MHz below what it needed in order to be triple the
speed of the other clock. This issue was the primary motivation for changing the
design of the inner product unit. The paths that were failing setup timing were all
related to the inner product and the Chip Planner, another tool within Quartus, was
used to show the paths that were being taken. In most cases, the path involved an
unnecessary stop at a register before passing back into the DSP block. By switching
out the adder and multiplier for the multiply-accumulate megafunction, there was a
significant decrease in required paths and registers. Therefore, routing was simpler
and clocks were not bouncing around nearly as much with fewer registers required.
There were a few changes made in order to accommodate the new architecture, but
it helped with timing immensely and functionality was verified in MATLAB. The
change allowed for the faster clock to have a maximum frequency up to 100 MHz
faster than its required speed and the slower clock also has a significant increase in
the ceiling for its speed. The setup timing failures were also removed with the removal
of the extra registers outside of the DSP blocks.
In analyzing the compilation results generated by Quartus, it was found that the
software was optimizing out several design-critical signals, including the data inputs,
which caused much of the subsequent logic to also be optimized out. After making
a few changes to combat these optimizations, including fixing the parenthesization
around signal indices and utilizing the 'noprune' attribute, new timing errors were
uncovered. This is one of the biggest pitfalls in working with synthesis tools on
large projects: there are limits on the sizes of the inputs, and the optimizer will
remove seemingly unused signals. A developer who is unaware of these optimizations
could be lulled into a false sense of security. Fortunately, this was discovered and
each removal was checked to see whether it was legitimate. Most of them were,
because extra space had been allocated in a signal that either never changes or remains
unused.
The timing battle continues when the full design is compiled together. Not only
is the routing more challenging, but new setup timing errors are also uncovered because of
the routes taken. Though floorplanning was attempted, in most cases it actually
prevented the fitter from being able to fit the design. This simply continued the need
for timing analysis and tweaks to bring the setup and hold errors in each of
the clocks back to a minimum.
Toolchain Fights
Some of the greatest frustration in implementing the design was simply in
figuring out how to work with the tools and obtain the desired results from them.
Oftentimes, it was a matter of tweaking settings in the software to display what the designer
wants to see (and what is actually occurring), rather than a problem with the hardware
code being tested.
SignalTap
Altera provides an internal logic analyzer to watch signals in a design that could
not be reached from an external analyzer. This is helpful in debugging a design;
however, because the logic analyzer uses device resources in the FPGA, any time a change
is made in the analyzer, the design has to be re-compiled. Additionally, the extra
resource usage can make it a challenge to use in large designs. It is therefore best suited to
modular designs, where a portion can be broken out and examined without requiring the
full design. This was a frequent problem encountered when debugging the part of the
project that communicates with the external DRAM because there was no other way
to look at the signals, and the particular signals that this project is passing in and
out of DRAM are 128 bits wide, so they each take up a lot of resources. The trick,
then, is to select only the signals that critically need to be observed and to minimize the
size of the overall project, scaling it back up after debugging is completed.
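The cost of capturing wide signals can be estimated with simple arithmetic. The following Python sketch is a rough lower-bound estimate only: the sample depth of 4096 and the 20,480-bit block size are assumed values, the function names are hypothetical, and the way the tools actually map capture memory onto device blocks is not modeled.

```python
BLOCK_BITS = 20480  # assumed capacity of one embedded memory block, in bits

def capture_bits(signal_widths, sample_depth):
    """Bits of capture memory needed to store every signal at every sample."""
    return sum(signal_widths) * sample_depth

def blocks_needed(bits, block_bits=BLOCK_BITS):
    """Lower bound on the number of memory blocks (ceiling division)."""
    return -(-bits // block_bits)

wide = capture_bits([128, 128], 4096)   # two full 128-bit DRAM buses
narrow = capture_bits([16], 4096)       # a 16-bit slice of one bus
print(wide, blocks_needed(wide))
print(narrow, blocks_needed(narrow))
```

Observing only a narrow slice of a bus reduces the capture memory in direct proportion to the number of bits dropped.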
TimeQuest Timing Analyzer
Altera’s TimeQuest Timing Analyzer is both a useful tool and a nuisance. It is
helpful in predicting timing and showing what the maximum achievable frequency is.
However, it requires proper user input to help it interpret how data is moving through
the design and how different clocks are related. Given the use of a clock that was
set to be three times faster than another clock and data that moved freely from one
clock’s domain to the other, this guidance for the tool was critical. Without it, the
setup and hold analysis reported a total failure through the system of hundreds of seconds.
The needed input was the correct set of multicycle paths to tell the tool how to analyze
data that crosses clock domains between the system clock and the triple-speed clock.
With this information added, the setup time error went from hundreds of seconds to
twenty seconds for the whole project. Further, upon changing the inner product unit,
the maximum frequency of the clocks was correctly above where they need to be and
the setup time error was completely mitigated. However, upon removal of some of the
optimizations that were compiling away needed registers, some of the setup timing
errors returned. These occurred mostly in the line scan camera side of the project,
as the object is 'built'. It is anticipated that utilizing the clock from the transceiver
block that actually corresponds to the line scan data will fix some of those errors.
Chip Planner
The Chip Planner utility provided with Quartus is extremely useful in visualizing
where resources are being used and the proximity of certain resources to each other.
It displays where each of the registers, DSP blocks, and I/O are being used for the
design after a compilation. It will also show data paths and can be linked to from
TimeQuest for viewing critical paths. A useful aspect that helps with timing is the
floorplanning feature. As the designer for the project, I was able to group signals
together and instruct the fitter to place them near one another. In doing so, the compile
time decreased because paths were found more easily, and the clock speed increased.
This is not always the case, though. When a floorplanning technique from a project
containing only the regression step was applied to the project containing the full
computation unit, the fitter was unable to fit the design. This is likely due to the
significant increase in size of the overall project, so the additional resources and paths
prevented the use of the same techniques for fitting as were utilized in the smaller
project. Nevertheless, even grouping a small portion of the design together helped
the fitter find a placement for the whole design sooner and more efficiently
than it could without any hints about the grouping. Grouping
for floorplans was particularly helpful in this project because of the way the design
is implemented. Because many generate statements are used for working with the
parallel channels, it helps the fitter when the designer, who knows what data moves through
each path, defines the grouping explicitly. In doing so, the fitter is able to try to
place the appropriate signals and data paths in proximity to related paths. A fitted
floor plan for the production Arria 10 development board is shown in Figure 4.4.
Figure 4.4: A fitted floor plan in the Arria 10.
MATLAB
The design was verified using the HDL Verifier toolbox available within the
MATLAB tool set. In order to use this with the floating point computational blocks,
a few specific files need to be included in a particular order to ensure
correct compilation with all libraries able to be located. Using this method, in
conjunction with ModelSim cosimulation, was useful in verifying the design, but not
easy to figure out at first. The HDL Verifier is particularly useful in large projects
such as this because it will provide inputs to the system and the outputs can be
compared with MATLAB calculations for easy verification. However, it was also
useful to have ModelSim running the design because it was sometimes easier to follow
the data path through each of the signals visually, rather than trying to pull out
the right information on the outputs. This was also a way to track internal signals
without having to port them out.
Verifying with MATLAB is an exercise in making sure that the functionality
of the design is fully understood. The design has to be programmed both in the MATLAB
language and in the hardware description under test. Because it is user code testing
user code, it is important that the desired functionality is fully understood and the
MATLAB code is believed to be correct. It is often necessary and useful to do a
couple of iterations by hand in order to assure oneself that the
MATLAB program works. When this does not happen, the debugging process is far
more frustrating. This was experienced in testing the regression system. MATLAB
was used to provide the inputs and, upon receiving an interrupt, it read the outputs
from the registers and wrote them to a file. The results in this file were compared
to those found by MATLAB on the same inputs to determine accuracy. At first, it
appeared to be working. Upon switching the regression to use an accumulator in the
inner product block, verification became less certain. Though the MATLAB code
did not change, the results from the VHDL could not be made to match it, even though the
VHDL made sense. Upon closer inspection, it turned out that the MATLAB script
was calculating the inner product incorrectly and thus the previous results had also been
verified against incorrect values. The updated MATLAB script was verified by hand for a couple of
pixels to confirm that it was indeed correct. With this change, the VHDL was
also verified to be accurate. This blunder provided an important lesson in verification,
as it would not have been discovered if the inner product unit had not changed.
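The lesson above suggests pinning the reference model to a value worked out by hand before trusting it against hardware. The following Python sketch illustrates the idea with a small reference model. It assumes the normalization is (raw - mean) multiplied by the inverted standard deviation, followed by an inner product with the class coefficients. The function names and the numbers are hypothetical and do not come from the MATLAB scripts used in this work.

```python
def normalize(raw, mean, inv_std):
    """Assumed normalization: subtract the mean, multiply by 1/std."""
    return [(r - m) * s for r, m, s in zip(raw, mean, inv_std)]

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def class_score(raw, mean, inv_std, coeffs):
    return inner_product(coeffs, normalize(raw, mean, inv_std))

# Small case worked by hand:
#   normalized = [(10-8)*0.5, (20-20)*1.0, (30-26)*0.25] = [1, 0, 1]
#   score      = 2*1 + 3*0 + 4*1 = 6
raw, mean = [10.0, 20.0, 30.0], [8.0, 20.0, 26.0]
inv_std, coeffs = [0.5, 1.0, 0.25], [2.0, 3.0, 4.0]
assert class_score(raw, mean, inv_std, coeffs) == 6.0
print("reference model matches the hand calculation")
```

Keeping such a hand-checked case in the test script means that an error in the reference model is caught independently of any change to the hardware design.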
MATLAB was also used to generate the code for the register constants that
would be sent over I2C to the clock generator. By modifying a previously existing
script, the register definitions could be documented with the defaults and the desired
values. For any future changes, the user can simply change the values in the script
and re-generate the code. It generates a series of constant definitions to be pasted in
the VHDL file that controls the command transmission. The HDL Verifier was used
to verify functionality of the I2C driving state machine to ensure that the device address,
register address, and data are sent and acknowledged correctly.
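The constant-generation step can be sketched as follows. This Python version is only an illustration of the approach: the register names, addresses, and values are placeholders rather than entries from the clock generator's register map, and the format of the emitted VHDL is an assumption.

```python
def vhdl_constants(registers):
    """Emit VHDL constant definitions for each (address, default, value) entry."""
    lines = []
    for name, (addr, default, value) in registers.items():
        lines.append(f"-- {name}: default 0x{default:02X}, desired 0x{value:02X}")
        lines.append(
            f'constant {name}_ADDR : std_logic_vector(7 downto 0) := x"{addr:02X}";'
        )
        lines.append(
            f'constant {name}_DATA : std_logic_vector(7 downto 0) := x"{value:02X}";'
        )
    return "\n".join(lines)

# Placeholder entries: name -> (register address, default value, desired value)
example_registers = {
    "EXAMPLE_REG_A": (0x10, 0x00, 0x32),
    "EXAMPLE_REG_B": (0x11, 0xFF, 0x08),
}
print(vhdl_constants(example_registers))
```

Changing a value in the table and re-running the script regenerates the constants, so the documentation of defaults and the generated code cannot drift apart.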
Toolchain Tricks
As previously mentioned, timing or optimization errors were often
caused by the tools misinterpreting the desired design. Many of
the changes made involved manipulating settings in the software to help
the Quartus toolchain interpret the design. These changes are detailed in this section.
In Quartus, attributes are used to assist the tools in interpretation and to ensure
that particular conditions are kept, contrary to what the tools might otherwise infer.
One of these attributes is 'noprune'. This is used to keep the synthesis analyzer
from removing a signal from the design. It is declared in the architecture prior to
the 'begin' statement as a boolean. The attribute is then applied to the appropriate
signal and set to 'true'. This was used in the object tracking file for the purpose of
ensuring that the tracking array was kept completely in the design. See Appendix B
for the object_tracking.vhd file and the usage of 'noprune'.
An additional resource to assist in design compilations is the Compiler Settings
found in the Settings menu of Quartus. Within this section, there are Advanced
Settings available for both Synthesis and the Fitter. These settings were used
primarily when the design was having trouble fitting in the device. Some of the
changes made in the Fitter settings include changing the optimization technique to
optimize for speed, changing the fitter seed value (a random number, different from
the default, was used), setting the optimization mode to 'high performance effort',
and setting the fitter aggressive routability optimization to 'always'. Many of these
settings default to 'automatically' or 'off' or, if a range is possible, to the
middle option. Changing these settings alerts the tools to the user’s priorities in the
design and directs greater effort toward fitting the design to
the device. The changes made for this design were done to prioritize timing closure
regardless of increases in compile time or increased difficulty in fitting, so long as a
fit was achieved.
Using the Chip Planner to set Logic Lock regions is another useful way to
assist in fitting the design to the device and optimizing for timing. Setting these regions
requires knowledge of the signals or resources that should be included in each region.
Incorrectly setting these could cause the fit to fail. Both outcomes, improved fits and
failed fits, were experienced in the development of the computation system and the full system. However, the
regions were used to separate out the parallel resources for ease of interpretation by
the tools.
TEST AND VERIFICATION
Camera Interface
The interface responsible for taking the data from the camera, combining it
with data from the DRAM FIFO, and assigning location information was tested via
MATLAB cosimulation and ModelSim. This was done by simulating the data from
the camera with memory blocks per tap and assigning location information, then verifying
that the locations were being assigned correctly. Subsequently, the DRAM interface
was added, and the steps of writing to the DRAM, pulling from it, and
combining location information with the incoming camera data were tested and
verified using SignalTap. A couple of different scenarios were checked, such as the case in which
the location from DRAM does not match the expected location corresponding to the
camera data location; in that case, the error flag needs to be set and all subsequent data can
be ignored until location zero is encountered again.
DRAM
The interface with the DRAM was primarily verified using the SignalTap Logic
Analyzer. An incrementing counter was written to the DRAM and then the same
space was read sequentially in a repetitive fashion to ensure that the memory
controller is functioning correctly. This was further verified with the use of the
buffer on the read side when combined with the camera interface. At the time of
this publication, the interaction between the HPS and the DRAM had not yet been
verified, though this can be done by passing the values read on the FPGA back to the
HPS for comparison with the values originally written to the memory.
Computation Unit
The computation unit was developed and tested in sections. All sections were
verified using MATLAB cosimulation. First, each of the components within the
regression calculation was developed and tested individually. These are the inner
product unit and the normalization block, with corresponding test bench scripts
inner_product_tb.m and normalize_tb.m that can be found in Appendix C. The
functionality of the megafunction which converts the fixed point numbers to floating
point was verified with the normalization block. Testing incrementally in this way
was also used to assist in the development of the component as a whole as it relies
on knowledge of the latency through each block to trigger some signals, such as the
signal indicating that a new pixel is beginning in the inner product or a result is
ready on the output. The inner product block was tested with the normalization by
inputting the values from MATLAB on the input ports and using the known latencies
to verify the output before the full unit was tested as it is expected to be used. This
means utilizing the Avalon memory-mapped interface to read and write registers and
accessing the results from memory after being triggered by an interrupt. This interrupt was
later moved in the full computation unit to be utilized for a different memory block.
The full verification of the regression was completed by writing the class coefficients,
mean and inverted standard deviation values to memory and piping in the input
values after setting the enable bit and interrupt enable bit. Upon completion of the
image matrix, the test bench spins on the interrupt until it is set. At this point, it
reads from the results memory block. The results read from the system are written
to a spreadsheet along with the expected values, as calculated in MATLAB, and
compared for accuracy. After satisfactory completion of this test, a full frame of a
small image is tested to verify that accurate results are obtained for each line in an
image. The test bench file for this verification is regression_tb.m (see Appendix C).
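When results from single-precision hardware are compared with values calculated in double precision, an exact match cannot be expected, so the comparison needs a tolerance. The following Python sketch shows one way such a comparison could be made. The tolerance values and function names are assumptions, and rounding through a 32-bit float is used here only to stand in for the hardware results.

```python
import math
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mismatches(hardware, reference, rel_tol=1e-5, abs_tol=1e-6):
    """Indices at which the hardware result differs beyond the tolerance."""
    return [
        i for i, (h, r) in enumerate(zip(hardware, reference))
        if not math.isclose(h, r, rel_tol=rel_tol, abs_tol=abs_tol)
    ]

reference = [0.1, 1.0 / 3.0, 6.0]
hardware = [to_float32(v) for v in reference]  # stand-in for values read back

print(hardware[0] == reference[0])      # exact comparison fails for 0.1
print(mismatches(hardware, reference))  # tolerance-based comparison passes
```

A tolerance that is too loose can hide real errors, so it should be chosen with the expected accumulation length and value range in mind.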
Other components of the computation unit verified individually in MATLAB
include the sorting block and the object classification block. The sorting block was
verified by reading the printout of sorted results to visually check that they are sorted,
and then checking that the indices line up with the sorted results (see sort_tb.m in
Appendix C). The ModelSim output was analyzed to verify the expected two clock
cycle latency for sorting. This verification was also useful in determining the order
in which the elements are sorted, whether from least to greatest or vice versa so
as to correctly interpret the results internally to the computation unit. The object
classification block was verified by creating a few sample objects in Paint that are
simply black on a white background for a clear distinction. The image was read into
MATLAB and the resulting data was used as the simulated transmission from the
monochrome line scan camera. A small section was used to check that the correct
classification results were being compiled over the object and a definitive answer was
correctly given at the end of the object (see objects_tb.m in Appendix C). Originally
developed within the object classification block is a component which converts the
monochrome pixel number to the hyperspectral pixel number. This was also verified
individually in MATLAB using camera_ratios_tb.m (see Appendix C) by generating
a plot to relate the hyperspectral pixel numbers with the monochrome pixel numbers
as shown in Figure 5.1. Figure 5.2 shows a section of the same plot, depicting the
nature of the relations. This component was moved to the Arria V side of the system
to alleviate the need for an excessive number of transmissions.
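The visual check of the sorting block described above could also be automated. The following Python sketch is a minimal illustration and is not the sort_tb.m script: it checks that the output is ordered in the expected direction and that each returned index points at the matching input value. The function name and the sample data are hypothetical.

```python
def check_sort(original, sorted_vals, indices, descending=False):
    """Return True if sorted_vals is ordered and indices map back to original."""
    if not (len(original) == len(sorted_vals) == len(indices)):
        return False
    pairs = zip(sorted_vals, sorted_vals[1:])
    ordered = all((a >= b) if descending else (a <= b) for a, b in pairs)
    is_permutation = sorted(indices) == list(range(len(original)))
    values_match = is_permutation and all(
        original[i] == v for i, v in zip(indices, sorted_vals)
    )
    return ordered and values_match

scores = [0.3, 0.9, 0.1]
print(check_sort(scores, [0.1, 0.3, 0.9], [2, 0, 1]))                   # ascending
print(check_sort(scores, [0.9, 0.3, 0.1], [1, 0, 2], descending=True))  # descending
print(check_sort(scores, [0.1, 0.3, 0.9], [2, 1, 0]))                   # bad indices
```

Running the check in both directions also answers the question of whether the block sorts from least to greatest or the reverse.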
Because the inputs run on several different clocks, testing of
the full computation unit from camera interface input to results of object classification
has not been performed in simulation, but with each of the components working as
expected, the author is confident in the full system functionality. This will be tested
further as development continues.
Figure 5.1: Generated plot depicting ratios between the pixels of the line scan camera
and the pixels of the hyperspectral camera.
Figure 5.2: Generated plot depicting ratios between the pixels of the line scan camera
and the pixels of the hyperspectral camera, zoomed in for greater detail.
FPGA to FPGA Transmission
The interface using the transceivers was verified by first sending information
between two different transceivers on the same board, before trying to link the two
boards together. The packet generator sent counter values and the checker
looked for counter values independently, enabling this same system to be used
when transmitting between the two boards. The packet generator and checker systems
were provided in a design example from the Altera Wiki [18]. SignalTap was used to
check error signals from the pattern checker and the SerialLite II core. In verifying the
transmission between boards, different transmission speeds were tested, including the
maximum rate that the Arria V can support, 6.5536 Gbps. At this rate, there were
significant errors in the transmission as bits flipped. The goal was to have at least 6
Gbps and this was achieved with minimal errors at a transmission rate of 6250 Mbps,
or 6.250 Gbps. The high-level files relevant for this testing are:
• a10_com.vhd
  – xcvr_core.vhd
    ∗ a10_phy.vhd
    ∗ sl2_core.vhd
    ∗ xcvr_pll.vhd
• packet_generate.vhd
• packet_verify.vhd
The xcvr_core.vhd file can be found in Appendix B, as can the top-level a10_com.vhd
file. The others were either provided by the design example or generated in the
MegaWizard for use in the project. A similar structure was used for testing on the
Arria V and the top-level file for that project (a5_com.vhd) can also be found in
Appendix B.
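The counter-based test described in this section works because the checker can predict every word without any connection to the generator. The following Python sketch models that idea behaviorally. It is not the logic of the design example: the checker here is assumed to lock onto the first word it receives and then free-run, and the function names and word width are hypothetical.

```python
def counter_words(count, width=32, start=0):
    """Generator side: consecutive counter values that wrap at the word width."""
    mask = (1 << width) - 1
    return [(start + i) & mask for i in range(count)]

def check_counter(received, width=32):
    """Checker side: returns (word_errors, bit_errors) against its own counter."""
    mask = (1 << width) - 1
    word_errors = bit_errors = 0
    expected = None
    for word in received:
        if expected is None:
            expected = word              # lock onto the first word received
        diff = word ^ expected
        if diff:
            word_errors += 1
            bit_errors += bin(diff).count("1")
        expected = (expected + 1) & mask  # free-running expected value
    return word_errors, bit_errors

sent = counter_words(8)
corrupted = list(sent)
corrupted[3] ^= 1 << 5                   # flip one bit in one word
print(check_counter(sent))
print(check_counter(corrupted))
```

Because the expected value is generated locally, the same checker works whether the generator is on the same board or on the other end of the link.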
CONCLUSION
A dynamic and powerful real-time image processing system is being developed
on an FPGA for application in sorting systems. The Arria 10 FPGA is utilized for
its high speed transceivers in addition to its hardened floating point DSP blocks and
hardened memory controllers. Development of the system in VHDL enables the use
of generic parameters for possible changes in the camera front-end to the system.
In doing so, the system is modular and can be utilized in various spaces. Test
and verification of the system have been performed using tools provided by Altera
and MathWorks to test individual subsystems as well as various combinations of
subsystems. Further development and testing will be required as the hardware is
developed and put in place for actual camera interactions with the FPGA. The
prototype developed demonstrates the benefit of floating point calculations in an
FPGA for real-time processing. Techniques utilized here can be taken for use in a
custom-built board on which a single smart camera system can reside.
REFERENCES CITED
[1] R. Snider, “Unpublished proposal in response to the Montana Board of Research
and Commercialization Technology request for proposals, research and
commercialization projects, fiscal year 2016 guidelines,” 2015, unpublished.
[2] “What is spectral imaging and when should I use it?” White Paper, Resonon.
[3] G. Lokman and G. Yilmaz, “Hyperspectral image classification using support
vector neural network algorithm,” pp. 239–243, 2015.
[4] (2016) Food sorting machines market: Global industry analysis and opportunity
assessment 2015-2025. Future Market Insights. 616 Corporate Way, Valley
Cottage, NY 10989. [Online]. Available:
http://www.futuremarketinsights.com/reports/food-sorting-machines-market
[5] “mvbluegemini technical details,” Matrix Vision GmbH, Talstrasse 16, 71570
Oppenweiler, 2016.
[6] “mvbluelynx-x technical details,” Matrix Vision GmbH, Talstrasse 16, 71570
Oppenweiler, 2014.
[7] (2016) Matrox Iris GT with Matrox Design Assistant 4. [Online]. Available:
http://www.matrox.com/imaging/media/pdf/products/iris_gt_da/iris_gt_da.pdf
[8] (2015) Razercam: Highspeed smart kamera for machine vision. Eye Vision
Technology. 76131 Karlsruhe Germany. [Online]. Available:
http://www.evt-web.com/fileadmin/img/products/RazerCam/RazerCam_15_EN_V004.pdf
[9] “Boa smart vision system,” Teledyne DALSA, 2013.
[10] “Camera link technology brief,” Basler Vision Technologies, 2001.
[11] (2016) Arria 10 SoCs: Features. Altera Corporation, now part of Intel.
101 Innovation Drive, San Jose, CA 95134. [Online]. Available:
https://www.altera.com/products/soc/portfolio/arria-10-soc/features.html
[12] M. Parker, “Understanding peak floating-point performance claims,” Altera
Corporation, June 2014.
[13] (2015) Altera’s 30 billion transistor FPGA. Gazettabyte. [Online]. Available:
http://www.gazettabyte.com/home/2015/6/28/alteras-30-billion-transistor-fpga.html
[14] (2015) Pads. Computer Software. Mentor Graphics. [Online]. Available:
https://www.pads.com
[15] “Si570/Si571 data sheet: 10 MHz to 1.4 GHz I2C programmable XO/VCXO,” Silicon
Labs, 400 West Cesar Chavez, Austin, TX 78701, 2014.
[16] S. Larson. (2015) EE Wiki. Version 2.2. [Online]. Available:
https://eewiki.net/pages/viewpage.action?pageId=10125324
[17] “Programmable clock generator 5P49V6913 datasheet,” IDT, 6024 Silver Creek
Valley Road, San Jose, CA 95138, 2015, revision C.
[18] (2015) Using SerialLite II IP on Arria 10 devices. Altera Wiki. [Online]. Available:
http://www.alterawiki.com/wiki/Using_SerialLite_II_IP_on_Arria_10_devices
APPENDICES
APPENDIX A
REGISTER DESCRIPTIONS
Computation Unit Registers
Table A.1: ENABLE Register Description
MSB  ENABLE (Block Offset = 0x0, Register Offset = 0x0)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0

Table A.2: IRQ ENABLE Register Description
MSB  IRQ ENABLE (Block Offset = 0x0, Register Offset = 0x4)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0

Table A.3: IRQ PENDING Register Description
MSB  IRQ PENDING (Block Offset = 0x0, Register Offset = 0x8)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0

Table A.4: NUM BINS Register Description
MSB  NUM BINS (Block Offset = 0x100, Register Offset = 0x0)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  I I I I  I I I I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  1 0 1 0  0 0 0 0

Table A.5: NUM PIXELS Register Description
MSB  NUM PIXELS (Block Offset = 0x100, Register Offset = 0x4)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - I I  I I I I  I I I I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 1 0 0  0 0 0 0  0 0 0 0

Table A.6: NUM CLASSES Register Description
MSB  NUM CLASSES (Block Offset = 0x100, Register Offset = 0x8)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    - - - -  - - - -  - - - -  - - - -  - - - -  - - - -  - - - I  I I I I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 1 1  0 1 0 0

Table A.7: FRAME COUNT Register Description
MSB  FRAME COUNT (Block Offset = 0x100, Register Offset = 0xC)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    I I I I  I I I I  I I I I  I I I I  I I I I  I I I I  I I I I  I I I I
Reset  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0

Table A.8: MEAN Register Description
MSB  MEAN (Block Offset = 0x1000)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    S E E E  E E E E  E F F F  F F F F  F F F F  F F F F  F F F F  F F F F

Table A.9: STD DEV I Register Description
MSB  STD DEV I (Block Offset = 0x4000)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    S E E E  E E E E  E F F F  F F F F  F F F F  F F F F  F F F F  F F F F

Table A.10: COEFFICIENT Register Description
MSB  COEFFICIENT (Block Offset = 0x100000)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    S E E E  E E E E  E F F F  F F F F  F F F F  F F F F  F F F F  F F F F

Table A.11: INNER PRODUCT Register Description
MSB  INNER PRODUCT (Block Offset = 0x200000)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    S E E E  E E E E  E F F F  F F F F  F F F F  F F F F  F F F F  F F F F

Table A.12: DECISION VECTOR Register Description
MSB  DECISION VECTOR (Block Offset = 0x300000)  LSB
Bits   31-28    27-24    23-20    19-16    15-12    11-8     7-4      3-0
R/W    S E E E  E E E E  E F F F  F F F F  F F F F  F F F F  F F F F  F F F F
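
The offsets in Tables A.1 through A.12 can be gathered into a single VHDL package so that address decoding logic refers to named constants rather than literals. The following is only a sketch: the package name, the constant names, and the reg_address helper are hypothetical and do not appear in the thesis code. It also assumes that a register address is the sum of its block offset and register offset, and that the address is 32 bits wide (as avs_s1_address is in regression.vhd). The S, E, and F fields are read here as the sign bit, 8 exponent bits, and 23 fraction bits of a 32-bit floating-point word.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical package: every identifier below is illustrative.
-- Only the numeric offsets are taken from Tables A.1 through A.12.
package regression_regs_pkg is

    -- Block offsets
    constant CONTROL_BLOCK         : natural := 16#0#;
    constant CONFIG_BLOCK          : natural := 16#100#;
    constant MEAN_BLOCK            : natural := 16#1000#;
    constant STD_DEV_I_BLOCK       : natural := 16#4000#;
    constant COEFFICIENT_BLOCK     : natural := 16#100000#;
    constant INNER_PRODUCT_BLOCK   : natural := 16#200000#;
    constant DECISION_VECTOR_BLOCK : natural := 16#300000#;

    -- Register offsets inside CONTROL_BLOCK (Tables A.1 to A.3)
    constant ENABLE_REG      : natural := 16#0#;
    constant IRQ_ENABLE_REG  : natural := 16#4#;
    constant IRQ_PENDING_REG : natural := 16#8#;

    -- Register offsets inside CONFIG_BLOCK (Tables A.4 to A.7)
    constant NUM_BINS_REG    : natural := 16#0#;
    constant NUM_PIXELS_REG  : natural := 16#4#;
    constant NUM_CLASSES_REG : natural := 16#8#;
    constant FRAME_COUNT_REG : natural := 16#C#;

    -- Bit positions of the S, E and F fields (Tables A.8 to A.12)
    constant FP_SIGN_BIT : natural := 31;
    subtype FP_EXPONENT is natural range 30 downto 23;
    subtype FP_FRACTION is natural range 22 downto 0;

    -- Assumption: address = block offset + register offset
    function reg_address(block_offset : natural; reg_offset : natural)
        return std_logic_vector;

end package regression_regs_pkg;

package body regression_regs_pkg is

    function reg_address(block_offset : natural; reg_offset : natural)
        return std_logic_vector is
    begin
        return std_logic_vector(to_unsigned(block_offset + reg_offset, 32));
    end function reg_address;

end package body regression_regs_pkg;
```

Under these assumptions, reg_address(CONFIG_BLOCK, NUM_CLASSES_REG) evaluates to x"00000108", and the exponent of a coefficient word can be selected with a slice such as word(FP_EXPONENT).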
APPENDIX B
VHDL CODE
-------------------------------------------------------------
--
--! @file i2c_driver.vhd
--! @brief Controls programming of clock generator over i2c
--! @details Contains state machine for programming registers in
--! VersaClock clock generator.
--! @author Monica Whitaker
--! @date September 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and Monica Whitaker
--
-- This program is free software: you can redistribute it and/or modify
-- it under the terms of the GNU General Public License as published by
-- the Free Software Foundation, either version 3 of the License, or
-- (at your option) any later version.
--
-- This program is distributed in the hope that it will be useful,
-- but WITHOUT ANY WARRANTY; without even the implied warranty of
-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-- GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public License
-- along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE; --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL; --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL; --! Use numeric standard.
-------------------------------------------------------------
--
--! @brief i2c_driver
--! @details Contains state machine for programming registers in
--! VersaClock clock generator.
--! @param clk System clock
--! @param reset_n Reset signal
--! @param enable Enable starting state machine
--! @param i2c_scl Clock line
--! @param i2c_sda bi-directional data line
--! @param error I2C communication error
--! @param done Indicates state machine complete
--! @param burn_success Status signal after burning configuration
--
-------------------------------------------------------------
entity i2c_driver is
    port (
        clk          : in std_logic;
        reset_n      : in std_logic;
        enable       : in std_logic;

        i2c_scl      : inout std_logic;
        i2c_sda      : inout std_logic;

        error        : out std_logic;

        done         : out std_logic;
        burn_success : out std_logic
    );
end entity;

architecture arch of i2c_driver is
    component i2c_master is
        GENERIC(
            input_clk : INTEGER := 50_000_000; --input clock speed (Hz)
            bus_clk   : INTEGER := 400_000);   --speed of scl (Hz)
        PORT(
            clk       : IN STD_LOGIC;
            reset_n   : IN STD_LOGIC;
            ena       : IN STD_LOGIC;
            addr      : IN STD_LOGIC_VECTOR(6 DOWNTO 0);
            rw        : IN STD_LOGIC;
            data_wr   : IN STD_LOGIC_VECTOR(7 DOWNTO 0);
            busy      : OUT STD_LOGIC;
            data_rd   : OUT STD_LOGIC_VECTOR(7 DOWNTO 0);
            ack_error : BUFFER STD_LOGIC;
            sda       : INOUT STD_LOGIC;
            scl       : INOUT STD_LOGIC
        );
    end component;

    --address of Clock Generator device
    --xD4 (xD5 to read)
    constant address_dev : std_logic_vector(7 downto 0) :=
        "11010100";

    --CONFIGURATION 0 HAS BEEN BURNED!!
    --CHANGE Burn Registers for further burns

    -- Reg00 Name: RAM0 00
    -- Reg00 Description: OTP Control
    -- Hex Address = 00
    -- Default = x"FF"
    constant Reg00_Addr : std_logic_vector(7 downto 0) :=
        "00000000";
    constant Reg00_Data : std_logic_vector(7 downto 0) :=
        "01100001";
    ---------------------------------------------------------
    -- Reg01 Name: RAM1 XTAL1
    -- Reg01 Description: X1 Load Capacitor
    -- Hex Address = 12
    -- Default = 00000001
    constant Reg01_Addr : std_logic_vector(7 downto 0) :=
        "00010010";
    constant Reg01_Data : std_logic_vector(7 downto 0) :=
        "00101001";
    ---------------------------------------------------------
    -- Reg02 Name: RAM1 XTAL2
    -- Reg02 Description: Factory Reserved
    -- Hex Address = 13
    -- Default = 00000000
    constant Reg02_Addr : std_logic_vector(7 downto 0) :=
        "00010011";
    constant Reg02_Data : std_logic_vector(7 downto 0) :=
        "00101000";
    ---------------------------------------------------------
    -- Reg03 Name: RAM1 Feedback
    -- Reg03 Description: Feedback Integer Divider (PLL)
    -- Hex Address = 17
    -- Default = 00000011
    constant Reg03_Addr : std_logic_vector(7 downto 0) :=
        "00010111";
    constant Reg03_Data : std_logic_vector(7 downto 0) :=
        "00000110";
    ---------------------------------------------------------
    -- Reg04 Name: RAM1 Feedback
    -- Reg04 Description: Feedback Integer Divider Bits (PLL)
    -- Hex Address = 18
    -- Default = 00000000
    constant Reg04_Addr : std_logic_vector(7 downto 0) :=
        "00011000";
    constant Reg04_Data : std_logic_vector(7 downto 0) :=
        "01000000";
    ---------------------------------------------------------
    -- Reg05 Name: RAM2 2E
    -- Reg05 Description: Output Divider Integer 2
    -- Hex Address = 2e
    -- Default = 11100000
    constant Reg05_Addr : std_logic_vector(7 downto 0) :=
        "00101110";
    constant Reg05_Data : std_logic_vector(7 downto 0) :=
        "10100000";
    ---------------------------------------------------------
    -- Reg06 Name: RAM6 60
    -- Reg06 Description: Clock1 Output Config
    -- Hex Address = 60
    -- Default = 10111011
    constant Reg06_Addr : std_logic_vector(7 downto 0) :=
        "01100000";
    constant Reg06_Data : std_logic_vector(7 downto 0) :=
        "01111011";
    ---------------------------------------------------------
    -- Reg07 Name: RAM1 1D
    -- Reg07 Description: VCO Monitoring
    -- Hex Address = 1D
    -- Default = 01101111
    constant Reg07_Addr : std_logic_vector(7 downto 0) :=
        "00011101";
    constant Reg07_Data : std_logic_vector(7 downto 0) :=
        "01001101";
    ---------------------------------------------------------
    -- Reg08 Name: RAM1 1E
    -- Reg08 Description: RC Control Register
    -- Hex Address = 1E
    -- Default = 00000000
    constant Reg08_Addr : std_logic_vector(7 downto 0) :=
        "00011110";
    constant Reg08_Data : std_logic_vector(7 downto 0) :=
        "10010010";
    ---------------------------------------------------------
    -- Reg09 Name: RAM1 1F
    -- Reg09 Description: RC Control Register
    -- Hex Address = 1F
    -- Default = 00110010
    constant Reg09_Addr : std_logic_vector(7 downto 0) :=
        "00011111";
    constant Reg09_Data : std_logic_vector(7 downto 0) :=
        "00110010";
    ---------------------------------------------------------
    -- BURN REG1 Name: User Start Address[8:0]
    -- Description: Part-Select Bit
    -- Hex Address = 73
    constant Burn_Reg1_Addr : std_logic_vector(7 downto 0) :=
        "01110011";
    constant Burn_Reg1_Data : std_logic_vector(7 downto 0) :=
        "00000000";
    ---------------------------------------------------------
    -- BURN REG2 Name: CFG0 Test block enable
    -- Description: Enable Sub-block's Test Mode
    -- Hex Address = 74
    constant Burn_Reg2_Addr : std_logic_vector(7 downto 0) :=
        "01110100";
    constant Burn_Reg2_Data : std_logic_vector(7 downto 0) :=
        "01001110";
    ---------------------------------------------------------
    -- BURN REG3 Name: User End Address[8:0]
    -- Description: Part-Select Bit
    -- Hex Address = 75
    constant Burn_Reg3_Addr : std_logic_vector(7 downto 0) :=
        "01110101";
    constant Burn_Reg3_Data : std_logic_vector(7 downto 0) :=
        "00110100";
    ---------------------------------------------------------
    -- BURN REG4 Name: User End Address
    -- Description: Part-Select Bit
    -- Hex Address = 76
    constant Burn_Reg4_Addr : std_logic_vector(7 downto 0) :=
        "01110110";
    constant Burn_Reg4_Data : std_logic_vector(7 downto 0) :=
        "11100001";
    ---------------------------------------------------------
    -- BURN REG5 Name: Burned Register Start Address
    -- Description: Burned register start address
    -- Hex Address = 77
    constant Burn_Reg5_Addr : std_logic_vector(7 downto 0) :=
        "01110111";
    constant Burn_Reg5_Data : std_logic_vector(7 downto 0) :=
        "00000000";
    ---------------------------------------------------------
    -- BURN REG6 Name: Read Register Start Address
    -- Description: Read register start address
    -- Hex Address = 78
    constant Burn_Reg6_Addr : std_logic_vector(7 downto 0) :=
        "01111000";
    constant Burn_Reg6_Data : std_logic_vector(7 downto 0) :=
        "00000000";
    ---------------------------------------------------------

    type state_type is (next_cmd, send_cmd);
    signal state : state_type;
    signal cmd_cnt : integer range 0 to 16; --35;
    signal end_count, cal_count : integer range 0 to 5000000;
    signal burn_count : integer range 0 to 25000000;

    signal i2c_ena, i2c_rw, i2c_busy, i2c_ack_error, busy_prev : std_logic;
    signal i2c_addr, slave_addr : std_logic_vector(6 downto 0);
    signal i2c_data_rd, i2c_data_wr, reg_addr, reg_data, data :
        std_logic_vector(7 downto 0);
    signal vco_val : std_logic_vector(4 downto 0);
    signal rw : std_logic;

begin

    i2c_io : i2c_master
        generic map(
            input_clk => 50_000_000,
            bus_clk   => 400_000)
        port map(
            clk       => clk,
            reset_n   => reset_n,
            ena       => i2c_ena,
            addr      => i2c_addr,
            rw        => i2c_rw,
            data_wr   => i2c_data_wr,
            busy      => i2c_busy,
            data_rd   => i2c_data_rd,
            ack_error => i2c_ack_error,
            sda       => i2c_sda,
            scl       => i2c_scl
        );

    process (clk, reset_n)
        variable busy_cnt : integer range 0 to 2;
    begin
        if (reset_n = '0') then
            busy_cnt := 0;
            done <= '0';
            state <= next_cmd;
            i2c_ena <= '0';
            end_count <= 0;
            cal_count <= 0;
            cmd_cnt <= 0;
            error <= '0';
            -- give the output a defined value; it is only set by the
            -- burn sequence, which is commented out below
            burn_success <= '0';
            burn_count <= 0;
        elsif (rising_edge(clk)) then
            if (enable = '1') then
                case state is
                    when send_cmd =>
                        -- latch busy signal
                        busy_prev <= i2c_busy;
                        if (busy_prev = '0' and i2c_busy = '1') then
                            busy_cnt := busy_cnt + 1;
                        end if;

                        case busy_cnt is
                            when 0 =>
                                i2c_ena <= '1';
                                i2c_addr <= slave_addr;
                                --always write first
                                i2c_rw <= '0';
                                i2c_data_wr <= reg_addr;
                            when 1 =>
                                if (rw = '1') then
                                    -- if reading, do so
                                    i2c_rw <= rw;
                                else --otherwise, write data
                                    i2c_data_wr <= reg_data;
                                end if;
                            when 2 =>
                                i2c_ena <= '0';
                                if (i2c_busy = '0') then
                                    --collect data read
                                    data <= i2c_data_rd;
                                    busy_cnt := 0;
                                    state <= next_cmd;
                                end if;
                        end case;

                    when next_cmd =>
                        --Burn process has known possibility
                        --of NAK
                        -- if (i2c_ack_error = '1' and cmd_cnt /= 0)
                        -- then
                        --     --state <= send_cmd;
                        --     error <= '1';
                        -- else
                            case cmd_cnt is
                                when 0 =>
                                    slave_addr <= address_dev(7 downto 1);
                                    rw <= '0'; --write
                                    reg_addr <= Reg01_Addr;
                                    reg_data <= Reg01_Data;
                                    cmd_cnt <= 1;
                                    state <= send_cmd;
                                when 1 =>
                                    reg_addr <= Reg02_Addr;
                                    reg_data <= Reg02_Data;
                                    cmd_cnt <= 2;
                                    state <= send_cmd;
                                when 2 =>
                                    reg_addr <= Reg03_Addr;
                                    reg_data <= Reg03_Data;
                                    cmd_cnt <= 3;
                                    state <= send_cmd;
                                when 3 =>
                                    reg_addr <= Reg04_Addr;
                                    reg_data <= Reg04_Data;
                                    cmd_cnt <= 4;
                                    state <= send_cmd;
                                when 4 =>
                                    reg_addr <= Reg05_Addr;
                                    reg_data <= Reg05_Data;
                                    cmd_cnt <= 5;
                                    state <= send_cmd;
                                when 5 =>
                                    reg_addr <= Reg06_Addr;
                                    reg_data <= Reg06_Data;
                                    cmd_cnt <= 6;
                                    state <= send_cmd;
                                when 6 =>
                                    reg_addr <= Reg07_Addr;
                                    reg_data <= Reg07_Data;
                                    cmd_cnt <= 7;
                                    state <= send_cmd;
                                when 7 =>
                                    reg_addr <= Reg08_Addr;
                                    reg_data <= Reg08_Data;
                                    cmd_cnt <= 8;
                                    state <= send_cmd;
                                when 8 =>
                                    reg_addr <= Reg09_Addr;
                                    reg_data <= Reg09_Data;
                                    cmd_cnt <= 9;
                                    state <= send_cmd;
                                when 9 => --Begin VCO calibration
                                    reg_addr <= x"11";
                                    reg_data <= "00001100";
                                    cmd_cnt <= 10;
                                    state <= send_cmd;
                                when 10 => --toggle bit 7
                                    reg_addr <= x"1C";
                                    reg_data <= "00000101";
                                    rw <= '0';
                                    cmd_cnt <= 11;
                                    state <= send_cmd;
                                when 11 =>
                                    reg_addr <= x"1C";
                                    reg_data <= "10000101";
                                    cmd_cnt <= 12;
                                    state <= send_cmd;
                                when 12 =>
                                    reg_addr <= x"1C";
                                    reg_data <= "00000101";
                                    cmd_cnt <= 13;
                                    state <= send_cmd;
                                when 13 => --wait 100ms
                                    if (cal_count = 5000000) then
                                        cal_count <= 0;
                                        reg_addr <= x"99";
                                        rw <= '1'; --read
                                        cmd_cnt <= 14;
                                        state <= send_cmd;
                                    else
                                        cal_count <= cal_count + 1;
                                    end if;
                                when 14 =>
                                    --store data read from register
                                    vco_val <= data(7 downto 3);
                                    cmd_cnt <= 15;
                                when 15 =>
                                    if (unsigned(vco_val) /=
                                        to_unsigned(23,5) and
                                        unsigned(vco_val) /=
                                        to_unsigned(0,5)) then
                                        --force VCO value
                                        reg_addr <= x"11";
                                        rw <= '0';
                                        reg_data <= "001" & vco_val;
                                        cmd_cnt <= 16;
                                        state <= send_cmd;
                                    else
                                        cmd_cnt <= 10; --repeat calibration
                                    end if;
                                --ONLY used for burning configuration--
                                -- when 16 =>
                                --     reg_addr <= Reg00_Addr;
                                --     reg_data <= Reg00_Data;
                                --     cmd_cnt <= 17;
                                --     state <= send_cmd;
                                -- when 17 => --set up burn registers
                                --     reg_addr <= Burn_Reg1_Addr;
                                --     reg_data <= Burn_Reg1_Data;
                                --     cmd_cnt <= 18;
                                --     state <= send_cmd;
                                -- when 18 =>
                                --     reg_addr <= Burn_Reg2_Addr;
                                --     reg_data <= Burn_Reg2_Data;
                                --     cmd_cnt <= 19;
                                --     state <= send_cmd;
                                -- when 19 =>
                                --     reg_addr <= Burn_Reg3_Addr;
                                --     reg_data <= Burn_Reg3_Data;
                                --     cmd_cnt <= 20;
                                --     state <= send_cmd;
                                -- when 20 =>
                                --     reg_addr <= Burn_Reg4_Addr;
                                --     reg_data <= Burn_Reg4_Data;
                                --     cmd_cnt <= 21;
                                --     state <= send_cmd;
                                -- when 21 =>
                                --     reg_addr <= Burn_Reg5_Addr;
                                --     reg_data <= Burn_Reg5_Data;
                                --     cmd_cnt <= 22;
                                --     state <= send_cmd;
                                -- when 22 =>
                                --     reg_addr <= Burn_Reg6_Addr;
                                --     reg_data <= Burn_Reg6_Data;
                                --     cmd_cnt <= 23;
                                --     state <= send_cmd;
                                -- when 23 =>
                                --     --wait 100ms
                                --     if (end_count = 5000000) then
                                --         cmd_cnt <= 24;
                                --     else
                                --         end_count <= end_count + 1;
                                --     end if;
                                -- when 24 => --start burn process
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F0";
                                --     cmd_cnt <= 25;
                                --     state <= send_cmd;
                                -- when 25 =>
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F8";
                                --     cmd_cnt <= 26;
                                --     state <= send_cmd;
                                -- when 26 =>
                                --     --wait 500ms
                                --     if (burn_count = 25000000) then
                                --         cmd_cnt <= 27;
                                --         burn_count <= 0;
                                --     else
                                --         burn_count <= burn_count + 1;
                                --     end if;
                                -- when 27 =>
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F0";
                                --     cmd_cnt <= 28;
                                --     state <= send_cmd;
                                -- when 28 =>
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F8";
                                --     cmd_cnt <= 29;
                                --     state <= send_cmd;
                                -- when 29 =>
                                --     --wait 500ms
                                --     if (burn_count = 25000000) then
                                --         reg_addr <= x"72";
                                --         reg_data <= x"F0";
                                --         state <= send_cmd;
                                --         cmd_cnt <= 30;
                                --     else
                                --         burn_count <= burn_count + 1;
                                --     end if;
                                -- when 30 => --margin read
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F2";
                                --     cmd_cnt <= 31;
                                --     state <= send_cmd;
                                -- when 31 =>
                                --     reg_addr <= x"72";
                                --     reg_data <= x"F0";
                                --     cmd_cnt <= 32;
                                --     state <= send_cmd;
                                -- when 32 => --test if successful
                                --     reg_addr <= x"9F";
                                --     rw <= '1';
                                --     cmd_cnt <= 33;
                                --     state <= send_cmd;
                                -- when 33 =>
                                --     if (data(1) = '1') then
                                --         error <= '1';
                                --     else
                                --         burn_success <= '1';
                                --     end if;
                                --     cmd_cnt <= 34;
                                -- when 34 => --reset register
                                --     reg_addr <= x"9F";
                                --     reg_data <= x"00";
                                --     rw <= '0';
                                --     cmd_cnt <= 35;
                                --     state <= send_cmd;
                                when 16 => --35 =>
                                    done <= '1';
                            end case;
                        -- end if;
                end case;
            end if;
        end if;
    end process;
end architecture;
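
The wait states in i2c_driver.vhd compare their counters against literal cycle counts (5000000 for the 100 ms waits, 25000000 for the 500 ms waits). The sketch below shows how those limits follow from the clock frequency. It assumes that clk runs at the same 50 MHz that is passed to i2c_master as input_clk; the package, function, and entity names are hypothetical and are not part of the thesis code.

```vhdl
-- Hypothetical helper package; the names are not part of i2c_driver.vhd.
package timing_pkg is
    -- Assumption: the system clock is the 50 MHz passed to i2c_master.
    constant CLK_FREQ_HZ : natural := 50_000_000;

    -- Number of clock cycles that span the given time in milliseconds.
    function ms_to_cycles(ms : natural; clk_hz : natural := CLK_FREQ_HZ)
        return natural;
end package timing_pkg;

package body timing_pkg is
    function ms_to_cycles(ms : natural; clk_hz : natural := CLK_FREQ_HZ)
        return natural is
    begin
        -- Divide first so the intermediate value stays within integer range.
        return (clk_hz / 1000) * ms;
    end function ms_to_cycles;
end package body timing_pkg;

use work.timing_pkg.all;

-- Simulation-only check against the literals used in the state machine.
entity timing_check is
end entity timing_check;

architecture sim of timing_check is
begin
    assert ms_to_cycles(100) = 5_000_000
        report "100 ms does not match the cal_count/end_count limit"
        severity failure;
    assert ms_to_cycles(500) = 25_000_000
        report "500 ms does not match the burn_count limit"
        severity failure;
end architecture sim;
```

With a helper of this kind, the counter ranges and comparison values could be declared as constants such as ms_to_cycles(100), so that a change of clock frequency would require editing only CLK_FREQ_HZ.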
-------------------------------------------------------------
--
--! @file regression.vhd
--! @brief The logistic regression computation unit.
--! @details Computes the statistical probability of a pixel
--!          belonging to a particular class given several
--!          different class coefficients and normalized pixel
--!          data.
--! @author Monica Whitaker
--! @date September 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE; --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL; --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL; --! Use numeric standard.
use IEEE.MATH_REAL.ALL; --! Use real math library

use work.Sensor_Package.all; --! Project constants package
-------------------------------------------------------------
--
--! @brief regression
--! @details Computes the statistical probability of a pixel
--!          belonging to a particular class given several
--!          different class coefficients and normalized pixel
--!          data.
--! @param TOTAL_INPUT_SIZE Size of all parallel pixel data.
--! @param WORD_SIZE Standard word size
--! @param input_clk Pixel clock
--! @param enable_in Enable signal from HPS
--! @param super_pixel_in Vector of all relevant pixel information for each parallel channel
--! @param pixel_results_out Vector of probabilities and pixel number
--! @param pixel_results_flag_out Flag indicating new results on output
--! @param frame_flag_out Flag to indicate new frame
--! @param fast_clk Clock running at triple the speed of
--!                 the input clock
--! @param hps_clk Clock for signals from HPS
--! @param rst_n System active-low reset signal
--! @param data_valid_in Indicates new data present on
--!                      super_pixel_in
--! @param clear_pixel_in Indicates bad pixel and trigger to clear the currently processing pixel when high
--! @param avs_s1_read Read request from HPS
--! @param avs_s1_write Write request from HPS
--! @param avs_s1_address Data address from HPS
--! @param avs_s1_readdata Output data for HPS
--! @param avs_s1_writedata Input data from HPS
--
-------------------------------------------------------------
entity regression is
    generic (
        TOTAL_INPUT_SIZE : natural :=
            NUMBER_OF_PARALLEL_CHANNELS * SUPER_PIXEL_SIZE;
        WORD_SIZE : natural := 32
    );
    port (
        input_clk              : in std_logic;
        enable_in              : in std_logic;
        super_pixel_in         : in std_logic_vector(TOTAL_INPUT_SIZE - 1 downto 0);
        pixel_results_out      : out std_logic_vector(
            NUMBER_OF_CLASSES*WORD_SIZE+PIXEL_ADDRESS_SIZE - 1 downto 0);
        pixel_results_flag_out : out std_logic;
        fast_clk               : in std_logic;
        hps_clk                : in std_logic;
        hps_reset              : in std_logic;
        rst_n                  : in std_logic;
        data_valid_in          : in std_logic;
        clear_pixel_in         : in std_logic;

        avs_s1_read            : in std_logic;
        avs_s1_write           : in std_logic;
        avs_s1_address         : in std_logic_vector(31 downto 0);
        avs_s1_readdata        : out std_logic_vector(31 downto 0);
        avs_s1_writedata       : in std_logic_vector(31 downto 0)
    );
end entity;

architecture rtl of regression is

    -------------------------------------------------------------
    -- Component Definitions
    -------------------------------------------------------------
    component normalize is --15 cycle latency
        port (
            clk            : in std_logic;
            rst_n          : in std_logic;
            data_valid_in  : in std_logic;
            data_in        : in std_logic_vector(31 downto 0);
            dark_in        : in std_logic_vector(31 downto 0);
            light_I_in     : in std_logic_vector(31 downto 0);
            mean_in        : in std_logic_vector(31 downto 0);
            stddev_I_in    : in std_logic_vector(31 downto 0);
            normalized_out : out std_logic_vector(31 downto 0)
        );
    end component normalize;

    component fp_mult_acc is --4 cycles
        port (
            a      : in std_logic_vector(31 downto 0) :=
                (others => '0');
            acc    : in std_logic := '0';
            areset : in std_logic := '0';
            b      : in std_logic_vector(31 downto 0) :=
                (others => '0');
            clk    : in std_logic := '0';
            q      : out std_logic_vector(31 downto 0)
        );
    end component;

    component memory_block is
        generic (
            num_elements_a : natural;
            num_elements_b : natural;
            size_address_a : natural;
            size_address_b : natural;
            size_word_a    : natural;
            size_word_b    : natural;
            mem_init       : string := "UNUSED"
        );
        port (
            address_a : in std_logic_vector(size_address_a-1 downto 0);
            address_b : in std_logic_vector(size_address_b-1 downto 0);
            clock_a   : in std_logic := '1';
            clock_b   : in std_logic := '1';
            data_a    : in std_logic_vector(size_word_a-1 downto 0);
            data_b    : in std_logic_vector(size_word_b-1 downto 0);
            wren_a    : in std_logic := '0';
            wren_b    : in std_logic := '0';
            q_a       : out std_logic_vector(size_word_a-1 downto 0);
            q_b       : out std_logic_vector(size_word_b-1 downto 0)
        );
    end component memory_block;

    component fixed_to_float is --2 cycles
        port (
            a      : in std_logic_vector(15 downto 0) :=
                (others => '0');
            areset : in std_logic := '0';
            clk    : in std_logic := '0';
            q      : out std_logic_vector(31 downto 0)
        );
    end component fixed_to_float;

    component channel_sum is
        generic (
            WORD_SIZE : natural := 32
        );
        port (
            clk          : in std_logic;
            fast_clk     : in std_logic;
            rst_n        : in std_logic;
            intercept_in : in std_logic_vector(WORD_SIZE-1 downto 0);
            data_in      : in std_logic_vector(
                NUMBER_OF_PARALLEL_CHANNELS*WORD_SIZE-1 downto 0);
            result_out   : out std_logic_vector(WORD_SIZE-1 downto 0)
        );
    end component;
    -------------------------------------------------------------
    -- Constant Definitions
    -------------------------------------------------------------
    --valid values = 1, 2, 4, 8, 16
    constant PSEUDO_PARALLEL_CHANNELS : natural := 8;

    constant MEMORY_WORDS_HPS : natural :=
        (NUMBER_OF_SPECTRAL_BINS / NUMBER_OF_PARALLEL_CHANNELS) *
        PSEUDO_PARALLEL_CHANNELS;
    constant HPS_MEM_ADDR_SIZE : natural :=
        natural(log2(real(MEMORY_WORDS_HPS)));

    constant CONVERSION_LEVELS : natural := 3;
    constant NORMALIZE_LEVELS : natural := 15;
    constant PRODUCT_LEVELS : natural := 4;
    --3 cycles per add
    constant COMBINATION_LEVELS : natural :=
        2*(NUMBER_OF_PARALLEL_CHANNELS);
    constant NUMBER_LEVELS : natural :=
        CONVERSION_LEVELS + NORMALIZE_LEVELS + PRODUCT_LEVELS +
        COMBINATION_LEVELS + 2;

    constant ZEROS : std_logic_vector(31 downto 0) := x"00000000";
    -------------------------------------------------------------
    -- Type Definitions
    -------------------------------------------------------------
    type write_array is array (1 to NUMBER_OF_CLASSES) of std_logic;
    type word_array is array (1 to NUMBER_OF_CLASSES) of
        std_logic_vector(WORD_SIZE - 1 downto 0);
    type class_array is array (1 to NUMBER_OF_CLASSES) of
        std_logic_vector(WORD_SIZE*PSEUDO_PARALLEL_CHANNELS - 1 downto 0);

    type row_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
        std_logic_vector(SPECTRAL_BIN_ADDRESS_SIZE - 1 downto 0);
    type column_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
        std_logic_vector(PIXEL_ADDRESS_SIZE - 1 downto 0);
    type in_data_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
        std_logic_vector(DATA_SIZE - 1 downto 0);
    type data_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of
        std_logic_vector(WORD_SIZE - 1 downto 0);
    type partials_array is array (1 to NUMBER_OF_CLASSES) of data_array;
    type product_array is array (1 to NUMBER_OF_CLASSES) of
        std_logic_vector(NUMBER_OF_PARALLEL_CHANNELS*WORD_SIZE - 1 downto 0);

    type prod_array is array (1 to NUMBER_OF_PARALLEL_CHANNELS) of std_logic;
    type prod_sig_array is array (1 to NUMBER_OF_CLASSES) of prod_array;

    type data_levels_array is array (1 to NUMBER_LEVELS) of data_array;
    type bins_array is array (1 to NUMBER_LEVELS) of row_array;
    type pixels_array is array (1 to NUMBER_LEVELS) of column_array;
    type logic_array is array (1 to NUMBER_LEVELS) of std_logic;
    type mem_addr_array is array (1 to NUMBER_LEVELS) of
        std_logic_vector(natural(trunc(log2(real(
            NUMBER_OF_SPECTRAL_BINS / NUMBER_OF_PARALLEL_CHANNELS))))-1 downto 0);
    -------------------------------------------------------------
    -- Signal Definitions
    -------------------------------------------------------------
    signal data_clk      : std_logic;
    signal reset         : std_logic;
    signal mean_write    : std_logic;
    signal stddevI_write : std_logic;

    signal valid, pixel_err : logic_array;

    signal row0             : row_array;
    signal column0          : column_array;
    signal bin              : bins_array;
    signal pixel            : pixels_array;
    signal mem_address      : mem_addr_array;
    signal data_float       : data_array;
    signal normal           : data_array;
    signal data             : in_data_array;
    signal lightI           : data_levels_array;
    signal dark             : data_levels_array;
    signal class, class_sig : class_array;

    signal intercepts     : word_array;
    signal results        : word_array;
    signal results_temp   : word_array;
    signal read_class     : word_array;
    signal read_intercept : word_array;
    signal read_result    : word_array;
    signal zero_array     : word_array :=
        (others => x"00000000");

    signal partial : partials_array;

    signal final_partial : product_array;

    signal class_write     : write_array;
    signal intercept_write : write_array;
    signal result_write    : write_array;

    signal acc : prod_sig_array;

    signal mean : std_logic_vector(
        PSEUDO_PARALLEL_CHANNELS * WORD_SIZE - 1 downto 0);
    signal stddevI : std_logic_vector(
        PSEUDO_PARALLEL_CHANNELS * WORD_SIZE - 1 downto 0);

    signal class_addr  : std_logic_vector(HPS_MEM_ADDR_SIZE - 1
                                          downto 0);
    signal mean_addr   : std_logic_vector(HPS_MEM_ADDR_SIZE - 1
                                          downto 0);
    signal stddev_addr : std_logic_vector(HPS_MEM_ADDR_SIZE - 1
                                          downto 0);

    signal read_mean    : std_logic_vector(WORD_SIZE - 1 downto 0);
    signal read_stddevI : std_logic_vector(WORD_SIZE - 1 downto 0);

begin

    -- last bin of pixel has finished processing
    pixel_results_flag_out <= '1' when
        (bin(NUMBER_LEVELS)(NUMBER_OF_PARALLEL_CHANNELS) =
         std_logic_vector(to_unsigned(NUMBER_OF_SPECTRAL_BINS - 1,
                                      SPECTRAL_BIN_ADDRESS_SIZE)))
        else '0';

    i_memory_block_means : memory_block -- fpga on b, hps on a
        generic map(
            num_elements_a => MEMORY_WORDS_HPS,
            num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                              NUMBER_OF_PARALLEL_CHANNELS,
            size_address_a => HPS_MEM_ADDR_SIZE,
            size_address_b => natural(trunc(log2(real(
                              NUMBER_OF_SPECTRAL_BINS /
                              NUMBER_OF_PARALLEL_CHANNELS)))),
            size_word_a    => WORD_SIZE,
            size_word_b    => PSEUDO_PARALLEL_CHANNELS * WORD_SIZE,
            mem_init       => "means.mif"
        )
        port map(
            address_a => mean_addr,
            address_b => mem_address(CONVERSION_LEVELS - 1),
            clock_a   => hps_clk,
            clock_b   => data_clk,
            data_a    => avs_s1_writedata,
            data_b    => (others => '0'),
            wren_a    => mean_write,
            wren_b    => '0',
            q_a       => read_mean,
            q_b       => mean
        );

    mean_addr <= avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0)
        when avs_s1_address(10) = '1';

    i_memory_block_stddevs : memory_block -- read on b, write on a
        generic map(
            num_elements_a => MEMORY_WORDS_HPS,
            num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                              NUMBER_OF_PARALLEL_CHANNELS,
            size_address_a => HPS_MEM_ADDR_SIZE,
            size_address_b => natural(trunc(log2(real(
                              NUMBER_OF_SPECTRAL_BINS /
                              NUMBER_OF_PARALLEL_CHANNELS)))),
            size_word_a    => WORD_SIZE,
            size_word_b    => PSEUDO_PARALLEL_CHANNELS * WORD_SIZE,
            mem_init       => "stddevs.mif"
        )
        port map(
            address_a => stddev_addr,
            address_b => mem_address(CONVERSION_LEVELS - 1),
            clock_a   => hps_clk,
            clock_b   => data_clk,
            data_a    => avs_s1_writedata,
            data_b    => (others => '0'),
            wren_a    => stddevI_write,
            wren_b    => '0',
            q_a       => read_stddevI,
            q_b       => stddevI
        );

    stddev_addr <= avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0)
        when avs_s1_address(12) = '1';

    g_normalize : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

        i_fixed_to_float : fixed_to_float
            port map(
                a      => data(j),
                areset => reset,
                clk    => data_clk,
                q      => data_float(j)
            );

        -- normalize by light, dark, mean, stddev
        i_normalize : normalize
            port map(
                clk            => data_clk,
                rst_n          => rst_n,
                data_valid_in  => valid(CONVERSION_LEVELS),
                data_in        => data_float(j),
                dark_in        => dark(CONVERSION_LEVELS)(j),
                lightI_in      => lightI(CONVERSION_LEVELS)(j),
                mean_in        => mean(WORD_SIZE * j - 1 downto
                                       WORD_SIZE * (j - 1)),
                stddevI_in     => stddevI(WORD_SIZE * j - 1 downto
                                          WORD_SIZE * (j - 1)),
                normalized_out => normal(j)
            );

    end generate;

    class_addr <= std_logic_vector(unsigned(avs_s1_address(
        HPS_MEM_ADDR_SIZE - 1 downto 0)) - 1);

    g_classify : for i in 1 to NUMBER_OF_CLASSES generate

        i_memory_block_intercepts : memory_block
            generic map(
                num_elements_a => 1,
                num_elements_b => 1,
                size_address_a => 1,
                size_address_b => 1,
                size_word_a    => WORD_SIZE,
                size_word_b    => WORD_SIZE,
                mem_init       => "UNUSED"
            )
            port map(
                address_a => "0",
                address_b => "0",
                clock_a   => data_clk,
                clock_b   => hps_clk,
                data_a    => (others => '0'),
                data_b    => avs_s1_writedata,
                wren_a    => '0',
                wren_b    => intercept_write(i),
                q_a       => intercepts(i),
                q_b       => read_intercept(i)
            );

        i_memory_block_classes : memory_block -- FPGA on b, HPS on a
            generic map(
                num_elements_a => MEMORY_WORDS_HPS,
                num_elements_b => NUMBER_OF_SPECTRAL_BINS /
                                  NUMBER_OF_PARALLEL_CHANNELS,
                size_address_a => HPS_MEM_ADDR_SIZE,
                size_address_b => natural(trunc(log2(real(
                                  NUMBER_OF_SPECTRAL_BINS /
                                  NUMBER_OF_PARALLEL_CHANNELS)))),
                size_word_a    => WORD_SIZE,
                size_word_b    => (PSEUDO_PARALLEL_CHANNELS *
                                   WORD_SIZE),
                mem_init       => "UNUSED"
            )
            port map(
                address_a => class_addr,
                address_b => mem_address(NORMALIZE_LEVELS +
                                         CONVERSION_LEVELS - 1),
                clock_a   => hps_clk,
                clock_b   => data_clk,
                data_a    => avs_s1_writedata,
                data_b    => (others => '0'),
                wren_a    => class_write(i),
                wren_b    => '0',
                q_a       => read_class(i),
                q_b       => class(i)
            );

        -- used in inner product calculation
        class_sig(i) <= class(i)
            when valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1'
            else (others => '0');

        -- Refer to register description document
        intercept_write(i) <= avs_s1_write when
            to_integer(unsigned(avs_s1_address(31 downto 18))) = 1 and
            to_integer(unsigned(avs_s1_address(
                17 downto HPS_MEM_ADDR_SIZE))) = i and
            avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) =
                ZEROS(HPS_MEM_ADDR_SIZE - 1 downto 0)
            else '0';

        class_write(i) <= avs_s1_write when
            to_integer(unsigned(avs_s1_address(31 downto 18))) = 1 and
            to_integer(unsigned(avs_s1_address(
                17 downto HPS_MEM_ADDR_SIZE))) = i and
            avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) /=
                ZEROS(HPS_MEM_ADDR_SIZE - 1 downto 0)
            else '0';

        result_write(i) <= '1' when
            (bin(NUMBER_LEVELS)(NUMBER_OF_PARALLEL_CHANNELS) =
             std_logic_vector(to_unsigned(NUMBER_OF_SPECTRAL_BINS - 1,
                                          SPECTRAL_BIN_ADDRESS_SIZE)))
            or pixel_err(NUMBER_LEVELS) = '1'
            else '0';

        -- pixel_results_out => <pixel_num, class_results (x16)>
        pixel_results_out(NUMBER_OF_CLASSES * WORD_SIZE +
                          PIXEL_ADDRESS_SIZE - 1 downto
                          NUMBER_OF_CLASSES * WORD_SIZE)
            <= pixel(NUMBER_LEVELS)(1);

        pixel_results_out(WORD_SIZE * i - 1 downto WORD_SIZE * (i - 1))
            <= results(i);

        i_memory_block_results : memory_block -- FPGA on a, HPS on b
            generic map(
                num_elements_a => NUMBER_OF_PIXELS,
                num_elements_b => NUMBER_OF_PIXELS,
                size_address_a => PIXEL_ADDRESS_SIZE,
                size_address_b => PIXEL_ADDRESS_SIZE,
                size_word_a    => WORD_SIZE,
                size_word_b    => WORD_SIZE,
                mem_init       => "UNUSED"
            )
            port map(
                address_a => pixel(NUMBER_LEVELS)(1),
                address_b => avs_s1_address(PIXEL_ADDRESS_SIZE - 1
                                            downto 0),
                clock_a   => data_clk,
                clock_b   => hps_clk,
                data_a    => results(i),
                data_b    => (others => '0'),
                wren_a    => result_write(i),
                wren_b    => '0',
                q_a       => open,
                q_b       => read_result(i)
            );

        -- add pixel results across parallel channels
        i_channel_sum_sum : channel_sum
            generic map(
                WORD_SIZE => WORD_SIZE
            )
            port map(
                clk          => data_clk,
                fast_clk     => fast_clk,
                rst_n        => rst_n,
                intercept_in => intercepts(i),
                data_in      => final_partial(i),
                result_out   => results_temp(i)
            );

        result_lock : process (data_clk, rst_n)
        begin
            if (rst_n = '0') then
                results(i) <= zero_array(i);
            elsif (rising_edge(data_clk)) then
                if (pixel_err(NUMBER_LEVELS - 1) = '0') then
                    results(i) <= results_temp(i);
                else
                    results(i) <= zero_array(i);
                end if;
            end if;
        end process;

        g_product : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

            -- do not accumulate when: beginning of pixel (first 5 bins)
            acc(i)(j) <= '0' when
                (j = 1 and
                 bin(NORMALIZE_LEVELS + CONVERSION_LEVELS)(1) =
                     ZEROS(SPECTRAL_BIN_ADDRESS_SIZE - 1 downto 0) and
                 valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1')
                or
                (j /= 1 and
                 bin(NORMALIZE_LEVELS + CONVERSION_LEVELS)(j) =
                     std_logic_vector(to_unsigned(
                         j - 1, SPECTRAL_BIN_ADDRESS_SIZE)) and
                 valid(NORMALIZE_LEVELS + CONVERSION_LEVELS) = '1')
                else '1';

            final_partial(i)(
                WORD_SIZE * (NUMBER_OF_PARALLEL_CHANNELS - (j - 1)) - 1
                downto
                WORD_SIZE * (NUMBER_OF_PARALLEL_CHANNELS - j))
                <= partial(i)(j);

            i_fp_mult_acc : fp_mult_acc
                port map(
                    a      => normal(j),
                    acc    => acc(i)(j),
                    areset => reset,
                    b      => class_sig(i)(WORD_SIZE * j - 1 downto
                                           WORD_SIZE * (j - 1)),
                    clk    => data_clk,
                    q      => partial(i)(j)
                );

        end generate;

    end generate;

    reset    <= not rst_n;
    data_clk <= input_clk when enable_in = '1' else '0';

    -- separate location information from input data
    base_location : for k in 1 to NUMBER_OF_PARALLEL_CHANNELS generate
        row0(k) <= super_pixel_in(
            (TOTAL_INPUT_SIZE -
             (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE - 1)
            downto
            (TOTAL_INPUT_SIZE -
             (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE -
             SPECTRAL_BIN_ADDRESS_SIZE));
        column0(k) <= super_pixel_in(
            (TOTAL_INPUT_SIZE -
             (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE -
             SPECTRAL_BIN_ADDRESS_SIZE - 1)
            downto
            (TOTAL_INPUT_SIZE -
             (NUMBER_OF_PARALLEL_CHANNELS - k) * SUPER_PIXEL_SIZE -
             SPECTRAL_BIN_ADDRESS_SIZE -
             PIXEL_ADDRESS_SIZE));
    end generate;

    -- Address Map
    -- 1000   - beginning of mean
    -- 4000   - beginning of stddev
    -- 100000 - beginning of class coefficients
    read_mux : process (hps_clk, hps_reset)
    begin
        if (hps_reset = '1') then
            avs_s1_readdata <= (others => '0');
            mean_write      <= '0';
            stddevI_write   <= '0';
        elsif (rising_edge(hps_clk)) then
            if (avs_s1_read = '1') then
                mean_write    <= '0';
                stddevI_write <= '0';
                if (to_integer(unsigned(
                        avs_s1_address(31 downto 10))) = 1) then
                    avs_s1_readdata <= read_mean;
                elsif (to_integer(unsigned(
                        avs_s1_address(31 downto 10))) = 4) then
                    avs_s1_readdata <= read_stddevI;
                elsif (to_integer(unsigned(
                        avs_s1_address(31 downto 18))) = 1) then
                    if (avs_s1_address(HPS_MEM_ADDR_SIZE - 1 downto 0) =
                            ZEROS(HPS_MEM_ADDR_SIZE - 1 downto 0)) then
                        avs_s1_readdata <= std_logic_vector(
                            read_intercept(to_integer(unsigned(
                                avs_s1_address(
                                    17 downto HPS_MEM_ADDR_SIZE)))));
                    else
                        avs_s1_readdata <= std_logic_vector(
                            read_class(to_integer(unsigned(
                                avs_s1_address(
                                    17 downto HPS_MEM_ADDR_SIZE)))));
                    end if;
                elsif (avs_s1_address(19) = '1') then
                    avs_s1_readdata <= std_logic_vector(
                        read_result(to_integer(unsigned(
                            avs_s1_address(
                                18 downto PIXEL_ADDRESS_SIZE)))));
                else
                    avs_s1_readdata <= (others => '0');
                end if;
            elsif (avs_s1_write = '1') then
                avs_s1_readdata <= (others => '0');
                if (to_integer(unsigned(
                        avs_s1_address(31 downto 10))) = 1) then
                    mean_write    <= '1';
                    stddevI_write <= '0';
                elsif (to_integer(unsigned(
                        avs_s1_address(31 downto 10))) = 4) then
                    stddevI_write <= '1';
                    mean_write    <= '0';
                else
                    mean_write    <= '0';
                    stddevI_write <= '0';
                end if;
            else
                avs_s1_readdata <= (others => '0');
                mean_write      <= '0';
                stddevI_write   <= '0';
            end if;
        end if;
    end process;

    -- pipeline for data information
    data_proc : process (data_clk, rst_n)
    begin
        if (rst_n = '0') then
            for k in 1 to NUMBER_LEVELS loop
                bin(k)   <= (others => (others => '0'));
                pixel(k) <= (others => (others => '0'));
            end loop;
        elsif (rising_edge(data_clk)) then

            for k in 1 to NUMBER_OF_PARALLEL_CHANNELS loop
                -- lock-in input data
                data(k) <= super_pixel_in(
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - SPECTRAL_BIN_ADDRESS_SIZE -
                     PIXEL_ADDRESS_SIZE - 1)
                    downto
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE));

                lightI(1)(k) <= super_pixel_in(
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE - 1)
                    downto
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                     LIGHT_CORRECT_SIZE));

                dark(1)(k) <= super_pixel_in(
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                     LIGHT_CORRECT_SIZE - 1)
                    downto
                    (TOTAL_INPUT_SIZE -
                     (NUMBER_OF_PARALLEL_CHANNELS - k) *
                     SUPER_PIXEL_SIZE - DATA_PACKAGE_SIZE -
                     LIGHT_CORRECT_SIZE - DARK_CORRECT_SIZE));
            end loop;

            for k in 1 to NUMBER_LEVELS loop
                if (k = 1) then
                    valid(k)     <= data_valid_in;
                    pixel_err(k) <= clear_pixel_in;
                    bin(k)       <= row0;
                    pixel(k)     <= column0;
                    if (data_valid_in = '1') then
                        if (row0(k) = ZEROS(
                                SPECTRAL_BIN_ADDRESS_SIZE - 1
                                downto 0)) then
                            mem_address(k) <= ZEROS(
                                natural(trunc(log2(real(
                                    NUMBER_OF_SPECTRAL_BINS /
                                    NUMBER_OF_PARALLEL_CHANNELS)))) - 1
                                downto 0);
                        else -- only increment address with each valid input
                            mem_address(k) <= std_logic_vector(
                                unsigned(mem_address(k)) + 1);
                        end if;
                    end if;
                else
                    valid(k)       <= valid(k - 1);
                    pixel_err(k)   <= pixel_err(k - 1);
                    bin(k)         <= bin(k - 1);
                    pixel(k)       <= pixel(k - 1);
                    mem_address(k) <= mem_address(k - 1);
                    lightI(k)      <= lightI(k - 1);
                    dark(k)        <= dark(k - 1);
                end if;
            end loop;
        end if;
    end process;
end architecture;
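
The address decoding performed by the read_mux process above can be summarized in a small helper. The following is a minimal sketch and is not part of the thesis sources: the package name address_map_sketch, the type region_t, and the function decode_region are hypothetical names introduced only for illustration. It assumes that avs_s1_address is a 32-bit word address, so that the values 1000, 4000, and 100000 in the address map comment are hexadecimal byte offsets corresponding to word addresses 0x400, 0x1000, and 0x40000. This correspondence is inferred from the bit tests in the listing rather than stated in it.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical helper package (illustration only, not in the thesis).
package address_map_sketch is
    type region_t is (REGION_MEAN, REGION_STDDEV, REGION_CLASS,
                      REGION_RESULT, REGION_NONE);
    function decode_region(addr : std_logic_vector(31 downto 0))
        return region_t;
end package address_map_sketch;

package body address_map_sketch is
    function decode_region(addr : std_logic_vector(31 downto 0))
        return region_t is
    begin
        -- Same tests, in the same priority order, as read_mux.
        if to_integer(unsigned(addr(31 downto 10))) = 1 then
            return REGION_MEAN;
        elsif to_integer(unsigned(addr(31 downto 10))) = 4 then
            return REGION_STDDEV;
        elsif to_integer(unsigned(addr(31 downto 18))) = 1 then
            return REGION_CLASS;   -- intercepts and class coefficients
        elsif addr(19) = '1' then
            return REGION_RESULT;
        else
            return REGION_NONE;
        end if;
    end function decode_region;
end package body address_map_sketch;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use work.address_map_sketch.all;

-- Self-checking testbench for the sketch above.
entity address_map_sketch_tb is
end entity address_map_sketch_tb;

architecture sim of address_map_sketch_tb is
begin
    check : process
    begin
        assert decode_region(x"00000400") = REGION_MEAN
            report "mean region not decoded" severity failure;
        assert decode_region(x"00001000") = REGION_STDDEV
            report "stddev region not decoded" severity failure;
        assert decode_region(x"00040000") = REGION_CLASS
            report "class region not decoded" severity failure;
        assert decode_region(x"00080000") = REGION_RESULT
            report "result region not decoded" severity failure;
        assert decode_region(x"00000000") = REGION_NONE
            report "unmapped address decoded" severity failure;
        wait;
    end process;
end architecture sim;
```

Within the class region, the listing further separates the intercept (low HPS_MEM_ADDR_SIZE bits all zero) from the class coefficients (low bits nonzero); the sketch leaves that split out because it depends on package constants not shown here.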
-------------------------------------------------------------
--
--! @file      normalize.vhd
--! @brief     Implements normalization of pixel data
--! @details   Utilizes multiplication and subtraction
--!            megafunctions to normalize incoming floating
--!            point data values
--! @author    Monica Whitaker
--! @date      August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see
-- <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL; --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;    --! Use numeric standard.
-------------------------------------------------------------
--
--! @brief   normalize
--! @details Utilizes multiplication and subtraction
--!          megafunctions to normalize incoming floating
--!          point data values
--! @param clk            Input clk
--! @param rst_n          Active low reset
--! @param data_valid_in  Enable signal for valid input
--! @param data_in        Pixel data value
--! @param dark_in        Dark correction value
--! @param lightI_in      Inverted light correction value
--! @param mean_in        Mean value
--! @param stddevI_in     Inverted standard deviation value
--! @param normalized_out Normalized pixel data
--
-------------------------------------------------------------
entity normalize is
    port (
        clk            : in  std_logic;
        rst_n          : in  std_logic;
        data_valid_in  : in  std_logic;
        data_in        : in  std_logic_vector(31 downto 0);
        dark_in        : in  std_logic_vector(31 downto 0);
        lightI_in      : in  std_logic_vector(31 downto 0);
        mean_in        : in  std_logic_vector(31 downto 0);
        stddevI_in     : in  std_logic_vector(31 downto 0);
        normalized_out : out std_logic_vector(31 downto 0)
    );
end entity normalize;

architecture rtl of normalize is
    -------------------------------------------------------------
    -- Component Definitions
    -------------------------------------------------------------
    component fp_func_subtract is -- 3 cyc
        port (
            a      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            areset : in  std_logic := '0';
            b      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            clk    : in  std_logic := '0';
            q      : out std_logic_vector(31 downto 0)
        );
    end component fp_func_subtract;

    component fp_func_mult is -- 3 cyc
        port (
            a      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            areset : in  std_logic := '0';
            b      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            clk    : in  std_logic := '0';
            q      : out std_logic_vector(31 downto 0)
        );
    end component fp_func_mult;

    component gte_compare is
        port (
            a      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            areset : in  std_logic := '0';
            b      : in  std_logic_vector(31 downto 0) :=
                         (others => '0');
            clk    : in  std_logic := '0';
            q      : out std_logic_vector(0 downto 0)
        );
    end component gte_compare;
    -------------------------------------------------------------
    -- Constant Definitions
    -------------------------------------------------------------
    constant NUMBER_LEVELS : natural := 15;
    constant ZEROS : std_logic_vector(31 downto 0) :=
        (others => '0');
    constant ONE   : std_logic_vector(31 downto 0) :=
        x"3F800000";
    -------------------------------------------------------------
    -- Type Definitions
    -------------------------------------------------------------
    type valid_array is array (1 to NUMBER_LEVELS) of std_logic;
    -------------------------------------------------------------
    -- Signal Definitions
    -------------------------------------------------------------
    signal data_valid : valid_array;

    signal lightI1, lightI2, lightI3, lightI4, lightI5 :
        std_logic_vector(31 downto 0);
    signal mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8 :
        std_logic_vector(31 downto 0);
    signal stdDev1, stdDev2, stdDev3, stdDev4, stdDev5, stdDev6,
           stdDev7, stdDev8, stdDev9, stdDev10, stdDev11 :
        std_logic_vector(31 downto 0);
    signal diff_temp       : std_logic_vector(31 downto 0);
    signal diff            : std_logic_vector(31 downto 0);
    signal corrected_temp  : std_logic_vector(31 downto 0);
    signal corrected       : std_logic_vector(31 downto 0);
    signal normalized_temp : std_logic_vector(31 downto 0);
    signal normalized      : std_logic_vector(31 downto 0);
    signal result          : std_logic_vector(0 downto 0);

    signal reset : std_logic;

begin

    reset <= not rst_n;
    -- Use Dark and Light to normalize between 0 and 1
    dark_sub : fp_func_subtract
        port map(
            a      => data_in,
            areset => reset,
            b      => dark_in,
            clk    => clk,
            q      => diff_temp
        );

    light_mult : fp_func_mult
        port map(
            a      => diff,
            areset => reset,
            b      => lightI4,
            clk    => clk,
            q      => corrected_temp
        );

    correct_compare : gte_compare
        port map(
            a      => corrected_temp,
            areset => reset,
            b      => ONE,
            clk    => clk,
            q      => result
        );

    mean_sub : fp_func_subtract
        port map(
            a      => corrected,
            areset => reset,
            b      => mean8,
            clk    => clk,
            q      => normalized_temp
        );

    stddev_mult : fp_func_mult
        port map(
            a      => normalized_temp,
            areset => reset,
            b      => stdDev11,
            clk    => clk,
            q      => normalized
        );

    proc : process (clk, rst_n)
    begin
        if (rst_n = '0') then
            normalized_out <= ZEROS;
        elsif (rising_edge(clk)) then
            -- pipeline values
            lightI1 <= lightI_in;
            lightI2 <= lightI1;
            lightI3 <= lightI2;
            lightI4 <= lightI3;

            mean1 <= mean_in;
            mean2 <= mean1;
            mean3 <= mean2;
            mean4 <= mean3;
            mean5 <= mean4;
            mean6 <= mean5;
            mean7 <= mean6;
            mean8 <= mean7;

            stdDev1  <= stddevI_in;
            stdDev2  <= stdDev1;
            stdDev3  <= stdDev2;
            stdDev4  <= stdDev3;
            stdDev5  <= stdDev4;
            stdDev6  <= stdDev5;
            stdDev7  <= stdDev6;
            stdDev8  <= stdDev7;
            stdDev9  <= stdDev8;
            stdDev10 <= stdDev9;
            stdDev11 <= stdDev10;

            -- pipeline valid signal
            for k in 1 to NUMBER_LEVELS loop
                if (k = 1) then
                    data_valid(k) <= data_valid_in;
                else
                    data_valid(k) <= data_valid(k - 1);
                end if;
            end loop;

            -- Check for negative values
            if (diff_temp(31) = '1') then
                diff <= (others => '0');
            else
                diff <= diff_temp;
            end if;

            if (result = "1") then -- corrected is >= 1
                corrected <= ONE;
            else
                corrected <= corrected_temp;
            end if;

            -- not new data, keep output at one to preserve inner
            -- product
            if (data_valid(NUMBER_LEVELS - 1) = '1') then
                normalized_out <= normalized;
            else
                normalized_out <= ONE;
            end if;

        end if;
    end process;
end architecture;
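
As a reading aid, the arithmetic carried out by normalize.vhd can be written out as equations. This is a sketch derived from the listing, not a formula quoted from the thesis text. The symbols are introduced here for illustration: $x$ is data_in, $d$ is dark_in, $\ell^{-1}$ is lightI_in, $\mu$ is mean_in, and $\sigma^{-1}$ is stddevI_in. It assumes that "inverted" in the port descriptions means the reciprocal of the corresponding value.

```latex
\begin{align}
  x_d &= \max(x - d,\; 0)
      && \text{dark subtraction, negative results clamped to } 0 \\
  x_c &= \min(x_d \cdot \ell^{-1},\; 1)
      && \text{light correction, results} \ge 1 \text{ clamped to } 1 \\
  z   &= (x_c - \mu) \cdot \sigma^{-1}
      && \text{mean removal and scaling}
\end{align}
% Pipeline depth, assuming the 3-cycle latency noted in the
% component comments and one register after each clamp and at
% the output:
%   3 (sub) + 1 (clamp) + 3 (mult) + 1 (clamp) + 3 (sub) + 3 (mult)
%   + 1 (output register) = 15
```

Under these assumptions the total of 15 cycles agrees with NUMBER_LEVELS in this file and with the delay taps lightI4, mean8, and stdDev11, which sit 4, 8, and 11 cycles after the input.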
-------------------------------------------------------------
--
--! @file      channel_sum.vhd
--! @brief     Adds together parallel inputs
--! @details   Compiles sum over number of parallel channels and
--!            adds in the 0th classification coefficient
--! @author    Monica Whitaker
--! @date      August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see
-- <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL; --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;    --! Use numeric standard.

use work.Sensor_Package.all; --! Project constants package file
-------------------------------------------------------------
--
--! @brief   channel_sum
--! @details Compiles sum over number of parallel channels and
--!          adds in the 0th classification coefficient
--! @param WORD_SIZE    Standard size of floating point data
--! @param clk          Input clk for data rate
--! @param fast_clk     Input clock running at triple
--!                     the speed of the clk
--! @param rst_n        Active low reset
--! @param intercept_in 0th classification coefficient
--! @param data_in      Vector of probabilities
--! @param result_out   Sum of all probabilities in
--!                     data_in and intercept_in
-------------------------------------------------------------
entity channel_sum is
    generic (
        WORD_SIZE : natural := 32
    );
    port (
        clk          : in  std_logic;
        fast_clk     : in  std_logic;
        rst_n        : in  std_logic;
        intercept_in : in  std_logic_vector(WORD_SIZE - 1 downto 0);
        data_in      : in  std_logic_vector(
            NUMBER_OF_PARALLEL_CHANNELS * WORD_SIZE - 1 downto 0);
        result_out   : out std_logic_vector(WORD_SIZE - 1 downto 0)
    );
end entity;

67 architecture r t l of channel sum i s
68
69 component fp func add i s −−3 c y c l e l a t ency
70 port (
71 a : in s t d l o g i c v e c t o r (31 downto 0) :=
72 ( others => ’ 0 ’ ) ;
73 a r e s e t : in s t d l o g i c := ’ 0 ’ ;
74 b : in s t d l o g i c v e c t o r (31 downto 0) :=
75 ( others => ’ 0 ’ ) ;
76 c l k : in s t d l o g i c := ’ 0 ’ ;
77 q : out s t d l o g i c v e c t o r (31 downto 0)
78 ) ;
79 end component fp func add ;
80
81 constant adde r l a t ency : natura l := 2 ;
82 constant comb ina t i on l e v e l s : natura l :=
NUMBEROF PARALLEL CHANNELS∗ adde r l a t ency ;
83
84 type data ar ray i s array (1 to comb ina t i on l e v e l s ) of
85 s t d l o g i c v e c t o r (NUMBEROF PARALLEL CHANNELS∗WORD SIZE−1
downto 0) ;
86 type answer array i s array (1 to NUMBEROF PARALLEL CHANNELS)
of s t d l o g i c v e c t o r (WORD SIZE−1 downto 0) ;
87
88 signal data de lay : data ar ray ;
89 signal output : answer array := ( others =>(others =>
’ 0 ’ ) ) ;
90 signal t emp re su l t s : answer array ;
91 signal r e s e t : s t d l o g i c ;
92
begin

  reset <= not rst_n;

  g_adder : for j in 1 to NUMBER_OF_PARALLEL_CHANNELS generate

    i_add_fp_func_add : fp_func_add
      port map(
        a      => temp_results(j),
        areset => reset,
        b      => data_delay(adder_latency*(j-1)+1)
                    (NUMBER_OF_PARALLEL_CHANNELS*
                     WORD_SIZE-(WORD_SIZE*(j-1))-1
                     downto NUMBER_OF_PARALLEL_CHANNELS
                     *WORD_SIZE-WORD_SIZE*j),
        clk    => fast_clk,
        q      => output(j)
      );

  end generate;

  pipeline : process (clk, rst_n)
  begin
    if (rst_n = '0') then
      result_out <= (others => '0');
    elsif (rising_edge(clk)) then
      for k in 1 to combination_levels loop
        if (k = 1) then
          data_delay(k) <= data_in;
        else
          data_delay(k) <= data_delay(k-1);
        end if;
      end loop;
      for j in 1 to NUMBER_OF_PARALLEL_CHANNELS loop
        if (j = 1) then
          temp_results(j) <= intercept_in;
        else
          temp_results(j) <= output(j-1);
        end if;
      end loop;
      result_out <= output(NUMBER_OF_PARALLEL_CHANNELS);
    end if;
  end process;
end architecture;
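The adder chain above can be checked against a small software reference model. The following Python sketch is not part of the thesis design; it only illustrates, under stated assumptions, how channel j is sliced out of the packed input bus and how the chained adders accumulate the channels onto the intercept. The helper names (float_to_word, pack_channels, channel_word, chain_sum) are hypothetical, pipeline timing and adder latency are ignored, and single-precision rounding is only approximated by re-encoding each partial sum.

```python
import struct

WORD_SIZE = 32  # bits per channel word (the listing's default generic value)


def float_to_word(x: float) -> int:
    """Encode a value as an IEEE-754 single-precision bit pattern."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def word_to_float(w: int) -> float:
    """Decode an IEEE-754 single-precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", w))[0]


def pack_channels(values):
    """Pack values so that channel 1 lands in the most significant word."""
    bus = 0
    for v in values:
        bus = (bus << WORD_SIZE) | float_to_word(v)
    return bus


def channel_word(bus: int, j: int, num_channels: int) -> int:
    """Extract channel j (1-based), mirroring the slice used in the port map:
    bits N*W - W*(j-1) - 1 downto N*W - W*j."""
    low = num_channels * WORD_SIZE - WORD_SIZE * j
    return (bus >> low) & ((1 << WORD_SIZE) - 1)


def chain_sum(bus: int, intercept: float, num_channels: int) -> float:
    """Accumulate channels 1..N onto the intercept, one adder per channel.
    Assumption: timing is ignored; only the arithmetic is modeled."""
    acc = intercept
    for j in range(1, num_channels + 1):
        term = word_to_float(channel_word(bus, j, num_channels))
        # Re-encode to approximate single-precision rounding of each stage.
        acc = word_to_float(float_to_word(acc + term))
    return acc
```

For example, chain_sum(pack_channels([1.0, 2.0, 0.5]), 0.25, 3) accumulates the three channels onto an intercept of 0.25. A model like this is mainly useful for generating expected values for a testbench.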
-------------------------------------------------------------
--
--! @file sort.vhd
--! @brief Sorts parallel inputs in descending order
--! @details Sorts input in two clock cycles and outputs
--!          sorted index numbers in addition to sorted
--!          results
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see
-- <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;                  --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;   --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;      --! Use numeric standard.
use IEEE.MATH_REAL.ALL;        --! Use real math library

use work.Sensor_Package.all;   --! Project constants package file
-------------------------------------------------------------
--
--! @brief sort
--! @details Sorts input in two clock cycles and outputs
--!          sorted index numbers in addition to sorted
--!          results
--! @param WORD_SIZE           Standard size of floating point data
--! @param clk                 Input clk for data rate
--! @param rst_n               Active low reset
--! @param u_list_in           Unsorted vector of values
--! @param s_list_out          Sorted vector of values
--! @param s_list_indices_out  Vector of indices of sorted values in
--!                            sorted order
-------------------------------------------------------------

entity sort is
  generic (
    WORD_SIZE : natural := 32
  );
  port (
    clk                : in  std_logic;
    rst_n              : in  std_logic;
    u_list_in          : in  std_logic_vector(
      NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
    s_list_out         : out std_logic_vector(
      NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
    s_list_indices_out : out std_logic_vector(
      NUMBER_OF_CLASSES*natural(trunc(log2(real(
      NUMBER_OF_CLASSES))))-1 downto 0)
  );
end entity;

architecture rtl of sort is

  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component gt_compare is  --a > b --> q = 1
    port (
      a      : in  std_logic_vector(31 downto 0) := (others
                 => '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) := (others
                 => '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(0 downto 0)
    );
  end component gt_compare;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  constant INDEX_BITS : natural := natural(trunc(log2(real(
    NUMBER_OF_CLASSES))));
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  type list_array is array (1 to NUMBER_OF_CLASSES)
    of std_logic_vector(31 downto 0);
  type position_array is array (1 to NUMBER_OF_CLASSES)
    of integer range 0 to NUMBER_OF_CLASSES;
  type result_array is array (1 to NUMBER_OF_CLASSES)
    of std_logic;
  type result_expand_array is array (1 to NUMBER_OF_CLASSES)
    of result_array;
  -------------------------------------------------------------
  -- Signal Definitions
  -------------------------------------------------------------
  signal unsorted, unsorted_reg : list_array;
  signal result                 : result_expand_array;
  signal sorted_index           : position_array;
  signal reset                  : std_logic;

begin

  reset <= not rst_n;

  g_compare : for j in 1 to NUMBER_OF_CLASSES generate

    unsorted(j) <= u_list_in((NUMBER_OF_CLASSES-(j-1))*
      WORD_SIZE-1 downto (NUMBER_OF_CLASSES-j)*WORD_SIZE);

    g_inner_compare : for k in 1 to NUMBER_OF_CLASSES
    generate
      i_compare : gt_compare
        port map(
          a      => unsorted(j),
          areset => reset,
          b      => unsorted(k),
          clk    => clk,
          q(0)   => result(j)(k)
        );
    end generate;

  end generate;

  process (clk, rst_n)
    variable sum_index : position_array;
  begin
    if (rst_n = '0') then
      sorted_index       <= (others => 0);
      sum_index          := (others => 0);
      s_list_indices_out <= (others => '0');
      s_list_out         <= (others => '0');
    elsif (rising_edge(clk)) then
      unsorted_reg <= unsorted;
      sum_index    := (others => 0);
      for j in 1 to NUMBER_OF_CLASSES loop
        for k in 1 to NUMBER_OF_CLASSES loop
          if (k >= j+1) then
            if (result(j)(k) = '1') then
              sum_index(j) := sum_index(j) + 1;
            else
              sum_index(k) := sum_index(k) + 1;
            end if;
          end if;
        end loop;
        sorted_index(j) <= sum_index(j) - 1;  --start from 0
        s_list_indices_out(INDEX_BITS*(NUMBER_OF_CLASSES-
          (sorted_index(j)))-1 downto INDEX_BITS*(
          NUMBER_OF_CLASSES-(sorted_index(j)+1))) <=
          std_logic_vector(to_unsigned(j, INDEX_BITS));
        --ordered least to greatest
        s_list_out(WORD_SIZE*(NUMBER_OF_CLASSES-
          sorted_index(j))-1
          downto WORD_SIZE*(NUMBER_OF_CLASSES-(
          sorted_index(j)+1)))
          <= unsorted_reg(j);
      end loop;
    end if;
  end process;
end architecture;
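The pairwise-comparison idea used by sort.vhd can be illustrated with a short software sketch. The Python below is not the author's implementation and is not cycle accurate: it only shows how counting the comparisons each element wins yields a unique rank, which then selects the output slot. The function names are hypothetical, and the listing's index offset, its mapping of ranks onto output bit positions, and its registered timing are deliberately not modeled.

```python
def rank_by_comparison(values):
    """Count, for each element, how many pairwise comparisons it wins.

    Each unordered pair (j, k) with j < k is compared once, as in the
    listing's inner loop guarded by k >= j+1. If values[j] > values[k]
    the earlier element wins; otherwise the later one does, so ties go
    to the later index and every element ends with a unique count.
    """
    n = len(values)
    counts = [0] * n
    for j in range(n):
        for k in range(j + 1, n):
            if values[j] > values[k]:
                counts[j] += 1
            else:
                counts[k] += 1
    return counts


def place_by_rank(values):
    """Scatter each value (and its original index) into the slot given by
    its win count. Slot 0 holds the smallest value in this sketch."""
    counts = rank_by_comparison(values)
    ordered = [None] * len(values)
    indices = [None] * len(values)
    for j, slot in enumerate(counts):
        ordered[slot] = values[j]
        indices[slot] = j
    return ordered, indices
```

Because every pair is compared independently, the comparisons map naturally onto parallel hardware comparators, which is why this style of sort trades area (on the order of N squared comparators) for a fixed, short latency.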
----------------------------------------------------------------------------
--
--! @file object_tracking.vhd
--! @brief Builds up classification based on object edges
--! @details Uses input from hyperspectral classifications and
--!          monochrome edge detection to compile object
--!          classifications over the identified pixels.
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see
-- <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
----------------------------------------------------------------------------
library IEEE;                  --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;   --! Use standard logic elements.
use IEEE.NUMERIC_STD.ALL;      --! Use numeric standard.
use IEEE.MATH_REAL.ALL;        --! Use real math library

use work.Sensor_Package.all;   --! Project constants package file
----------------------------------------------------------------------------
--
--! @brief object_tracking
--! @details Uses input from hyperspectral classifications and
--!          monochrome edge detection to compile object
--!          classifications over the identified pixels.
--!          Keeps array of object numbers based on pixel
--!          number.
--! @param MAX_OBJECT_NUMBER  Maximum number of objects
--!                           possible at any one time
--! @param WORD_SIZE          Standard size of floating point data
--! @param linescan_clk       Input clk from transmission of
--!                           monochrome data
--! @param data_clk           Input clock from hyperspectral
--!                           classification
--! @param fast_clk           Input clock running at triple
--!                           the speed of the data_clk
--! @param rst_n              Active low reset
--! @param line_rst_n         Active low reset for linescan_clk domain
--! @param linescan_obj       Information about object
--!                           location from line scan camera
--!                           Contains line number, object
--!                           number, start pixel, end
--!                           pixel
--! @param new_results        Flag to indicate new
--!                           hyperspectral pixel results
--! @param class_results_in   Hyperspectral results vector
--!                           of class probabilities with
--!                           pixel number
--! @param decision_vector    Vector of overall
--!                           probabilities for classes
--!                           and object number.
----------------------------------------------------------------------------
entity object_tracking is
  generic (MAX_OBJECT_NUMBER : natural := 64;
           WORD_SIZE         : natural := 32
  );
  port (linescan_clk     : in  std_logic;
        data_clk         : in  std_logic;
        fast_clk         : in  std_logic;
        rst_n            : in  std_logic;
        line_rst_n       : in  std_logic;
        linescan_obj     : in  std_logic_vector(
          PIXEL_ADDRESS_SIZE*2+OBJECT_ADDRESS_SIZE+WORD_SIZE-1
          downto 0);
        new_results      : in  std_logic;
        class_results_in : in  std_logic_vector(
          NUMBER_OF_CLASSES*WORD_SIZE+PIXEL_ADDRESS_SIZE-1
          downto 0);
        decision_vector  : out std_logic_vector(
          NUMBER_OF_CLASSES*WORD_SIZE+OBJECT_ADDRESS_SIZE-1
          downto 0)
  );
end entity;

architecture arch of object_tracking is

  ----------------------------------------------------------
  -- Component Definitions
  ----------------------------------------------------------
  component memory_block is
    generic (
      num_elements_a : natural;
      num_elements_b : natural;
      size_address_a : natural;
      size_address_b : natural;
      size_word_a    : natural;
      size_word_b    : natural;
      mem_init       : string := "UNUSED"
    );
    port (
      address_a : in  std_logic_vector(size_address_a-1
                    downto 0);
      address_b : in  std_logic_vector(size_address_b-1
                    downto 0);
      clock_a   : in  std_logic := '1';
      clock_b   : in  std_logic := '1';
      data_a    : in  std_logic_vector(size_word_a-1
                    downto 0);
      data_b    : in  std_logic_vector(size_word_b-1
                    downto 0);
      wren_a    : in  std_logic := '0';
      wren_b    : in  std_logic := '0';
      q_a       : out std_logic_vector(size_word_a-1
                    downto 0);
      q_b       : out std_logic_vector(size_word_b-1
                    downto 0)
    );
  end component memory_block;

  component fp_func_add is  --3 cycle latency
    port (
      a      : in  std_logic_vector(31 downto 0) := (others =>
                 '0');
      areset : in  std_logic := '0';
      b      : in  std_logic_vector(31 downto 0) := (others =>
                 '0');
      clk    : in  std_logic := '0';
      q      : out std_logic_vector(31 downto 0)
    );
  end component fp_func_add;

  ----------------------------------------------------------
  -- Constant Definitions
  ----------------------------------------------------------
  constant LINESCAN_INPUT_SIZE : natural := WORD_SIZE +
    OBJECT_ADDRESS_SIZE + PIXEL_ADDRESS_SIZE*2;
  --Valid values = 1,2,4,8,16
  constant MEMORY_RATIO : natural := NUMBER_OF_CLASSES;
  constant CLASS_NUMBER : natural := natural(trunc(log2(real(
    NUMBER_OF_CLASSES))));
  constant ZEROS : std_logic_vector(MEMORY_RATIO*WORD_SIZE-1
    downto 0) := (others => '0');

  ----------------------------------------------------------
  -- Type Definitions
  ----------------------------------------------------------
  type pixel_array is array (0 to NUMBER_OF_PIXELS) of integer
    range 0 to MAX_OBJECT_NUMBER;

  ----------------------------------------------------------
  -- Signal Definitions
  ----------------------------------------------------------
  signal pixel_tracker, previous_line_pixels, past_line :
    pixel_array;
  signal reset              : std_logic;
  signal frame_count        : std_logic_vector(WORD_SIZE-1
    downto 0);
  signal regress_line       : std_logic_vector(WORD_SIZE-1
    downto 0);
  signal reg_reg, reg_latch : std_logic_vector(WORD_SIZE-1
    downto 0);
  signal past_linescan_line : std_logic_vector(WORD_SIZE-1
    downto 0);

  signal update_mem_address   : std_logic_vector(
    OBJECT_ADDRESS_SIZE-1 downto 0);
  signal decision_vector_temp : std_logic_vector(MEMORY_RATIO*
    WORD_SIZE-1 downto 0);
  signal mem_write            : std_logic;
  signal object_clear_write   : std_logic;
  signal ready_write          : std_logic;
  signal ready_write2         : std_logic;
  signal mem_pixel            : std_logic_vector(
    NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal combined_pixel       : std_logic_vector(
    NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal new_pix_add          : std_logic_vector(
    NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);
  signal new_pixel            : std_logic_vector(
    NUMBER_OF_CLASSES*WORD_SIZE-1 downto 0);

  signal output_mem_address : std_logic_vector(
    OBJECT_ADDRESS_SIZE-1 downto 0);
  signal out_object_num     : std_logic_vector(
    OBJECT_ADDRESS_SIZE-1 downto 0);
  signal newlinenum         : std_logic_vector(WORD_SIZE-1
    downto 0);
  signal pixstart, pixend   : std_logic_vector(
    PIXEL_ADDRESS_SIZE-1 downto 0);

  signal object_num : std_logic_vector(
    OBJECT_ADDRESS_SIZE-1 downto 0);

  signal start_line      : std_logic;
  signal reg_start_line  : std_logic;
  signal reg_start_line2 : std_logic;

  attribute noprune : boolean;
  attribute noprune of pixel_tracker : signal is true;

begin

  ASSERT (MEMORY_RATIO >= NUMBER_OF_CLASSES)
    report "Invalid number of classes for memory block"
    severity error;

  reset <= not rst_n;

  i_class_result_mem : memory_block  --update on b, output on a
    generic map(
      num_elements_a => MAX_OBJECT_NUMBER,
      num_elements_b => MAX_OBJECT_NUMBER,
      size_address_a => OBJECT_ADDRESS_SIZE,
      size_address_b => OBJECT_ADDRESS_SIZE,
      size_word_a    => NUMBER_OF_CLASSES * WORD_SIZE,
      size_word_b    => NUMBER_OF_CLASSES * WORD_SIZE,
      mem_init       => "UNUSED"
    )
    port map(
      address_a => output_mem_address,
      address_b => update_mem_address,
      clock_a   => data_clk,
      clock_b   => data_clk,
      data_a    => (others => '0'),
      data_b    => combined_pixel,
      wren_a    => object_clear_write,
      wren_b    => mem_write,
      q_a       => decision_vector_temp,  --registered
      q_b       => mem_pixel
    );

  accumulate : for k in 1 to NUMBER_OF_CLASSES generate

    i_add_fp_func_add : fp_func_add
      port map(
        a      => mem_pixel(WORD_SIZE*(NUMBER_OF_CLASSES-(k-1))
                    -1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k)),
        areset => reset,
        b      => new_pix_add(WORD_SIZE*(NUMBER_OF_CLASSES-(k-1)
                    )-1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k)),
        clk    => fast_clk,
        q      => combined_pixel(WORD_SIZE*(NUMBER_OF_CLASSES-(k
                    -1))-1 downto WORD_SIZE*(NUMBER_OF_CLASSES-k))
      );

  end generate;


223 a c c e p t p i x e l s : process ( l i n e s c an c l k , l i n e r s t n )
224 variable c u r r e n t l i n e s c a n l i n e : s t d l o g i c v e c t o r (
WORD SIZE−1 downto 0) ;
225 begin
226 i f ( l i n e r s t n = ’0 ’ ) then
227 p r e v i o u s l i n e p i x e l s <= ( others => 0) ;
228 p i x e l t r a c k e r <= ( others => 0) ;
229 r e g r e s s l i n e <= ( others => ’ 0 ’ ) ;
230 p a s t l i n e s c a n l i n e <= ( others => ’ 0 ’ ) ;
231 e l s i f ( r i s i n g e d g e ( l i n e s c a n c l k ) ) then
232 −−l i n e coun t r e s e t = <zeros , ones , 0 ,NUMBER OF PIXELS−1>
233 i f ( l i n e s c a n ob j (PIXEL ADDRESS SIZE−1 downto 0) =
234 s t d l o g i c v e c t o r ( to uns igned (NUMBER OF PIXELS−1,
235 PIXEL ADDRESS SIZE) ) and l i n e s c a n ob j (
PIXEL ADDRESS SIZE ∗
236 2 − 1 downto PIXEL ADDRESS SIZE) =
s t d l o g i c v e c t o r (
237 to uns igned (0 ,PIXEL ADDRESS SIZE) ) ) then
238 s t a r t l i n e <= ’1 ’ ;
239 else
240 s t a r t l i n e <= ’0 ’ ;
241 c u r r e n t l i n e s c a n l i n e := l i n e s c a n ob j (
LINESCAN INPUT SIZE−1 downto
LINESCAN INPUT SIZE−WORD SIZE) ;
242 object num <= l i n e s c a n ob j (LINESCAN INPUT SIZE−
WORD SIZE−1 downto PIXEL ADDRESS SIZE∗2) ;
243 pixend <= l i n e s c a n ob j (PIXEL ADDRESS SIZE−1
downto 0) ;
244 p i x s t a r t <= l i n e s c a n ob j (PIXEL ADDRESS SIZE
∗2−1 downto PIXEL ADDRESS SIZE) ;
245 newlinenum <= l i n e s c a n ob j (LINESCAN INPUT SIZE−1
downto LINESCAN INPUT SIZE−WORD SIZE) ;
246
247 i f ( unsigned ( c u r r e n t l i n e s c a n l i n e ) /=
248 unsigned ( p a s t l i n e s c a n l i n e ) ) then
249 −−new l i n e
250 p r e v i o u s l i n e p i x e l s <= p i x e l t r a c k e r ;
251 p i x e l t r a c k e r <= ( others => 0) ;
114
252 end i f ;
253
254 for k in 1 to NUMBER OF PIXELS loop
255 exit when k = unsigned ( pixend ) + 1 ;
256 i f ( k >= unsigned ( p i x s t a r t ) and k <= unsigned (
pixend ) )
257 then
258 −−OBJECT NUMBER;
259 p i x e l t r a c k e r ( k ) <= to i n t e g e r ( unsigned (
object num ) ) ;
260 end i f ;
261 end loop ;
262 p a s t l i n e s c a n l i n e <= cu r r e n t l i n e s c a n l i n e ;
263 r e g r e s s l i n e <= newlinenum ;
264 end i f ;
265 end i f ;
266 end process ;
267
268
  --one pixel result at a time, just add in as needed!
  --INPUT = <pix#, class#, class result>
  process (data_clk, rst_n)
    variable pixel_num         : integer range 0 to
      NUMBER_OF_PIXELS-1;
    variable current_line      : pixel_array;
    variable regress_frame_num : std_logic_vector(WORD_SIZE
      -1 downto 0);
  begin
    if (rst_n = '0') then
      new_pixel          <= (others => '0');
      regress_frame_num  := (others => '0');
      update_mem_address <= (others => '0');
      output_mem_address <= (others => '0');
      ready_write        <= '0';
    elsif (rising_edge(data_clk)) then
      reg_start_line2 <= start_line;
      reg_start_line  <= reg_start_line2;

      reg_reg   <= regress_line;
      reg_latch <= reg_reg;

      if (reg_start_line = '1') then
        regress_frame_num := (others => '0');
      elsif (new_results = '1') then
        pixel_num := to_integer(unsigned(class_results_in
          (NUMBER_OF_CLASSES*WORD_SIZE+
          PIXEL_ADDRESS_SIZE-1 downto NUMBER_OF_CLASSES*
          WORD_SIZE)));
        if (pixel_num = 0) then
          regress_frame_num := std_logic_vector(
            unsigned(regress_frame_num) + 1);
          past_line <= current_line;
          if (unsigned(regress_frame_num) = unsigned(
              reg_latch)) then
            current_line := pixel_tracker;
          else
            current_line := previous_line_pixels;
          end if;
        end if;
      end if;

      if (new_results = '1') then
        if (pixel_num > 0 and pixel_num < NUMBER_OF_PIXELS
            -1) then
          if (current_line(pixel_num-1) /= 0 and
              current_line(pixel_num) /= 0 and
              current_line(pixel_num+1) /= 0) then
            --read from memory, add together, re-write to memory
            new_pixel <= class_results_in(
              NUMBER_OF_CLASSES*
              WORD_SIZE-1 downto 0);

            update_mem_address <= std_logic_vector(
              to_unsigned
              (current_line(pixel_num),
              OBJECT_ADDRESS_SIZE));
            ready_write <= '1';
          elsif (current_line(pixel_num-1) = 0 and
                 past_line(pixel_num-1) /= 0) then

            if (current_line(pixel_num) = 0 and
                past_line(pixel_num) /= 0) then

              output_mem_address <=
                std_logic_vector(
                to_unsigned(past_line(pixel_num),
                OBJECT_ADDRESS_SIZE));
            end if;
            ready_write <= '0';
            new_pixel   <= (others => '0');
          else
            ready_write <= '0';
            new_pixel   <= (others => '0');
          end if;
        else
          ready_write <= '0';
          new_pixel   <= (others => '0');
        end if;
      else
        new_pixel   <= (others => '0');
        ready_write <= '0';

      end if;
      new_pix_add  <= new_pixel;
      ready_write2 <= ready_write;  --pipeline while adder operates
      mem_write    <= ready_write2;
    end if;
  end process;

  output_proc : process (data_clk, rst_n)
  begin
    if (rst_n = '0') then
      decision_vector <= (others => '0');
    elsif (rising_edge(data_clk)) then
      out_object_num <= output_mem_address;
      if (decision_vector_temp /= ZEROS) then
        decision_vector <= out_object_num &
          decision_vector_temp;
        object_clear_write <= '1';
      else
        decision_vector    <= (others => '0');
        object_clear_write <= '0';
      end if;
    end if;
  end process;

end architecture;
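The accept_pixels process above slices four fields out of the linescan_obj word: line number in the top WORD_SIZE bits, then object number, then start pixel, with end pixel in the lowest bits. The Python sketch below restates that layout for use in a test harness. It is illustrative only: the field widths OBJECT_ADDRESS_SIZE and PIXEL_ADDRESS_SIZE come from Sensor_Package, which is not shown in this listing, so the values used here are assumptions, and the names LinescanObj, pack_linescan_obj and unpack_linescan_obj are hypothetical.

```python
from typing import NamedTuple

WORD_SIZE = 32            # default generic value in the listing
OBJECT_ADDRESS_SIZE = 6   # assumption: real value lives in Sensor_Package
PIXEL_ADDRESS_SIZE = 10   # assumption: real value lives in Sensor_Package


class LinescanObj(NamedTuple):
    line: int       # line number (top WORD_SIZE bits)
    obj: int        # object number
    start_pix: int  # first pixel of the object on this line
    end_pix: int    # last pixel of the object on this line (lowest bits)


def pack_linescan_obj(rec: LinescanObj) -> int:
    """Build the word as <line#, object#, start pix, end pix>, MSB first."""
    word = rec.line & ((1 << WORD_SIZE) - 1)
    word = (word << OBJECT_ADDRESS_SIZE) | (rec.obj & ((1 << OBJECT_ADDRESS_SIZE) - 1))
    word = (word << PIXEL_ADDRESS_SIZE) | (rec.start_pix & ((1 << PIXEL_ADDRESS_SIZE) - 1))
    word = (word << PIXEL_ADDRESS_SIZE) | (rec.end_pix & ((1 << PIXEL_ADDRESS_SIZE) - 1))
    return word


def unpack_linescan_obj(word: int) -> LinescanObj:
    """Recover the four fields using the same bit ranges as the listing."""
    pix_mask = (1 << PIXEL_ADDRESS_SIZE) - 1
    end_pix = word & pix_mask
    start_pix = (word >> PIXEL_ADDRESS_SIZE) & pix_mask
    obj = (word >> (2 * PIXEL_ADDRESS_SIZE)) & ((1 << OBJECT_ADDRESS_SIZE) - 1)
    line = (word >> (2 * PIXEL_ADDRESS_SIZE + OBJECT_ADDRESS_SIZE)) & ((1 << WORD_SIZE) - 1)
    return LinescanObj(line, obj, start_pix, end_pix)
```

Round-tripping a record through pack_linescan_obj and unpack_linescan_obj gives a quick check that stimulus written for a simulation uses the same field order as the hardware.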
-------------------------------------------------------------
--
--! @file DRAM_controller.vhd
--! @brief The master driver to pull data from DRAM.
--! @details Passes bursting reads from DRAM through buffer
--!          for use by system
--! @author Monica Whitaker
--! @date October 2015
--! @copyright Copyright (C) 2015 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see
-- <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
-------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use ieee.numeric_std.all;  --! Use numeric standard
use ieee.math_real.all;

use work.Sensor_Package.ALL;
-------------------------------------------------------------
--
--! @brief DRAM_controller
--! @details Passes bursting reads from DRAM through buffer
--!          for use by system
--! @param memory_clk                     Input ref clock for DDR
--! @param system_clk                     Buffer data output clock
--! @param rst_n                          Active low reset
--! @param avm_read_master_read           Master Read enable
--! @param avm_read_master_address        Master address
--! @param avm_read_master_burstcount     Master burstcount
--! @param avm_read_master_readdata       Master readdata
--! @param avm_read_master_readdatavalid  Master data valid
--! @param avm_read_master_waitrequest    Master read waitrequest
--! @param avm_write_master_write         Master write enable
--! @param avm_write_master_address       Master write address
--! @param avm_write_master_writedata     Master writedata
--! @param avm_write_master_waitrequest   Master write waitrequest
--! @param avs_csr_write                  Slave write enable
--! @param avs_csr_address                Slave write address
--! @param avs_csr_writedata              Slave writedata
--! @param avs_csr_waitrequest            Slave write waitrequest
--! @param write_clk                      Output of memory_clk
--! @param read_start                     Enable reading from DDR
--! @param buffer_read_en                 Read enable for FIFO
--! @param buffer_empty                   FIFO empty
--! @param buffer_readdata                FIFO readdata
--
-------------------------------------------------------------
entity DRAM_controller is
  port (memory_clk : in std_logic;
        system_clk : in std_logic;
        rst_n      : in std_logic;

        --read master signals
        avm_read_master_read          : out std_logic;
        avm_read_master_address       : out std_logic_vector(31
          downto 0);
        avm_read_master_burstcount    : out std_logic_vector(5
          downto 0);
        avm_read_master_readdata      : in  std_logic_vector(127
          downto 0);
        avm_read_master_readdatavalid : in  std_logic;
        avm_read_master_waitrequest   : in  std_logic;

        --write master signals -- debug writing signals
        avm_write_master_write       : out std_logic;
        avm_write_master_address     : out std_logic_vector(31
          downto 0);
        avm_write_master_writedata   : out std_logic_vector(127
          downto 0);
        avm_write_master_waitrequest : in  std_logic;

        --export signals for writing
        avs_csr_write       : in  std_logic;
        avs_csr_address     : in  std_logic_vector(31
          downto 0);
        avs_csr_writedata   : in  std_logic_vector(127
          downto 0);
        avs_csr_waitrequest : out std_logic;
        write_clk           : out std_logic;

        --conduit export signals
        read_start      : in  std_logic;  -- 1 if write done
        buffer_read_en  : in  std_logic;
        buffer_empty    : out std_logic;
        buffer_readdata : out std_logic_vector(127 downto
          0)
  );
end entity;

architecture controller_arch of DRAM_controller is
  -------------------------------------------------------------
  -- Component Definitions
  -------------------------------------------------------------
  component dual_clock_fifo is
    generic (
      lpm_numwords       : natural;
      lpm_width          : natural;
      lpm_widthu         : natural;
      rdsync_delaypipe   : natural;
      underflow_checking : string;
      wrsync_delaypipe   : natural);
    port (
      data      : in  std_logic_vector(lpm_width - 1
                    downto 0)
                    := (others => 'X');
      wrreq     : in  std_logic := 'X';
      rdreq     : in  std_logic := 'X';
      wrclk     : in  std_logic := 'X';
      rdclk     : in  std_logic := 'X';
      aclr      : in  std_logic := '0';
      q         : out std_logic_vector(lpm_width - 1
                    downto 0);
      rdempty   : out std_logic;
      wrfull    : out std_logic;
      rdfull    : out std_logic;
      wrempty   : out std_logic;
      rdusedw   : out std_logic_vector(lpm_widthu - 1
                    downto 0);
      wrusedw   : out std_logic_vector(lpm_widthu - 1
                    downto 0);
      eccstatus : out std_logic_vector(1 downto 0));
  end component dual_clock_fifo;
  -------------------------------------------------------------
  -- Constant Definitions
  -------------------------------------------------------------
  constant BURST_LENGTH      : natural := 32;
  constant BURST_LENGTH_SIZE : natural := 6;
  constant BUFFER_DEPTH      : natural := 1024;
  constant READDATA_SIZE     : natural := DRAM_DATA_SIZE;
  constant TOTAL_BURSTS      : natural := natural(trunc(
    real((NUMBER_OF_PIXELS*NUMBER_OF_SPECTRAL_BINS)/
    BURST_LENGTH)));
  constant BYTES_PER_WORD    : natural := natural(trunc(
    real(READDATA_SIZE)/real(8)));
  -------------------------------------------------------------
  -- Type Definitions
  -------------------------------------------------------------
  -- state machine states
  type read_states_T is (idle,
                         fifo_wait,
                         mid_burst,
                         finish_reads);
  -------------------------------------------------------------
  -- Signal Declarations
  -------------------------------------------------------------
  -- fifo signals
  signal buffer_write : std_logic;
  signal buffer_full  : std_logic;
  signal buffer_words : std_logic_vector(9 downto 0);

  signal read_state : read_states_T;

  -- extra read master signals
  -- the current read address
  signal read_address : std_logic_vector(31 downto 0);
  -- tracks the number of bursts completed
  signal bursts_completed : std_logic_vector(natural(trunc(log2
    (real(TOTAL_BURSTS)))) downto 0);
  -- tracks the available room in the fifo
  signal room_in_fifo : std_logic_vector(10 downto 0);
  -- tracks the number of transactions that are waiting to be
  -- returned
  signal pending_reads : std_logic_vector(10 downto 0);

  -- extra write master signals
  -- the current write address
  signal write_address : std_logic_vector(31 downto 0);
  -- track number of values written
  signal counter : integer range 0 to TOTAL_BURSTS*
    BURST_LENGTH+1;
  -- DEBUG: alert read FSM when writing complete
  signal counter_check : std_logic;
  signal start_address1 : std_logic_vector(31 downto 0) := x"
    00000000";

begin
    avs_csr_waitrequest        <= avm_write_master_waitrequest;
    write_clk                  <= memory_clk;
    avm_write_master_address   <= avs_csr_address;
    avm_write_master_write     <= avs_csr_write;
    avm_write_master_writedata <= avs_csr_writedata;

    i_dcfifo_buffer : component dual_clock_fifo
        generic map(
            lpm_numwords       => BUFFER_DEPTH,
            lpm_width          => DRAM_DATA_SIZE,
            lpm_widthu         => 10,
            rdsync_delaypipe   => 4,
            underflow_checking => "OFF",
            wrsync_delaypipe   => 4
        )
        port map(
            data      => avm_read_master_readdata,
            wrreq     => buffer_write,
            rdreq     => buffer_read_en,
            wrclk     => memory_clk,
            rdclk     => system_clk,
            q         => buffer_readdata,
            rdempty   => buffer_empty,
            wrfull    => buffer_full,
            aclr      => open,
            eccstatus => open,
            rdfull    => open,
            rdusedw   => open,
            wrempty   => open,
            wrusedw   => buffer_words
        );
    ------------------------------------------------------------
    ------------------------------------------------------------
    -- READ FSM 1
    -- read light/dark matrix values -- addresses x"00000000" to x"0FFFFFFF"
    ------------------------------------------------------------
    ------------------------------------------------------------
    read_FSM_1 : process (memory_clk, rst_n)
    begin
        if (rst_n = '0' or read_start = '0') then
            read_state       <= idle;
            read_address     <= start_address_1;
            bursts_completed <= (others => '0');
            pending_reads    <= (others => '0');
        elsif (rising_edge(memory_clk)) then

            -- DEFAULT SECTION
            -- decrement the pending reads counter if data is returned
            if (avm_read_master_readdatavalid = '1') then
                pending_reads <= std_logic_vector(unsigned(pending_reads) - 1);
            end if;

            case read_state is
                -- IDLE
                -- When idle just sit and wait for the go flag.
                -- Only start if the write state machine is idle as it may
                -- be finishing a previous data transfer.
                -- Start the machine by moving to the fifo_wait state and
                -- initialising address and counters.
                when idle =>
                    -- if read_start = '1' then
                    read_state       <= fifo_wait;
                    read_address     <= start_address_1;
                    pending_reads    <= (others => '0');
                    bursts_completed <= (others => '0');
                    -- end if;

                -- FIFO WAIT
                -- When in this state wait for the fifo to have sufficient
                --   space for a complete burst. If so, start a burst by
                --   moving to the mid_burst state. When moving to mid_burst
                --   add the burst value to the pending reads counter.
                when fifo_wait =>
                    -- check that fifo has enough space for 32 word burst
                    if (unsigned(room_in_fifo) >= BURST_LENGTH + 5) then
                        read_state <= mid_burst;
                        -- add 32 to the pending reads counter but be
                        --   mindful that a word may be returned at the same
                        --   time
                        if (avm_read_master_readdatavalid = '0') then
                            pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH);
                        else
                            pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH - 1);
                        end if;

                    end if;

                -- MID BURST
                -- Count bursts
                -- If all bursts complete go to finish_reads state.
                -- Otherwise stay in this state if there is room in fifo or
                --   return to fifo_wait if not. As each burst is completed
                --   increment address, bursts completed counter and pending
                --   reads counter. Be mindful to do nothing if waitrequest
                --   is active
                when mid_burst =>
                    -- if waitrequest is active do nothing, otherwise...
                    if (avm_read_master_waitrequest /= '1') then
                        if (bursts_completed = std_logic_vector(to_unsigned(TOTAL_BURSTS - 1, natural(trunc(log2(real(TOTAL_BURSTS)))) + 1))) then
                            read_state <= finish_reads;
                            -- no need to check for pending reads complete
                            --   as we've just requested another 32 words
                        else
                            bursts_completed <= std_logic_vector(unsigned(bursts_completed) + 1);
                            read_address     <= std_logic_vector(unsigned(read_address) + BURST_LENGTH*BYTES_PER_WORD);
                            if (unsigned(room_in_fifo) >= BURST_LENGTH + 5)
                            then
                                read_state <= mid_burst;
                                -- add 32 to the pending reads counter but
                                --   be mindful that a word may be returned
                                --   at the same time
                                if (avm_read_master_readdatavalid = '0') then
                                    pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH);
                                else
                                    pending_reads <= std_logic_vector(unsigned(pending_reads) + BURST_LENGTH - 1);
                                end if;
                            else
                                read_state <= fifo_wait;
                            end if;
                        end if;

                    end if;

                -- FINISH READS
                -- All the read address phases are complete but there will
                --   be readdata pending. Just sit and wait until there is no
                --   readdata pending and then move to idle state. Note that
                --   the pending_reads counter is decremented in the default
                --   section above.
                when finish_reads =>
                    if (avm_read_master_readdatavalid = '1') then
                        if (unsigned(pending_reads) = 1) then
                            read_state <= idle;
                        end if;
                    end if;

            end case;
        end if;
    end process;

    avm_read_master_read <= '1' when read_state = mid_burst else '0';

    room_in_fifo <= std_logic_vector(resize((to_unsigned(BUFFER_DEPTH, natural(trunc(log2(real(BUFFER_DEPTH)))) + 1) - unsigned(buffer_words) - unsigned(pending_reads)), 11));

    avm_read_master_address <= read_address;

    -- simply write data into the fifo as it comes in (read asserted and
    -- waitrequest not active)
    buffer_write <= avm_read_master_readdatavalid;

    avm_read_master_burstcount <= std_logic_vector(to_unsigned(BURST_LENGTH, BURST_LENGTH_SIZE));

    ------------------------------------------------------------
    -- DEBUG section
    ------------------------------------------------------------
    -- -- Writes counter values for testing purposes.
    -- write_FSM : process (memory_clk, rst_n)
    -- begin
    --     if (rst_n = '0') then
    --         write_address <= start_address_1;
    --         counter_check <= '0';
    --         counter <= 0;
    --         avm_write_master_write <= '1';
    --     elsif (rising_edge(memory_clk)) then
    --         if (avm_write_master_waitrequest /= '1') then
    --
    --             if (counter = TOTAL_BURSTS*BURST_LENGTH+1) then
    --                 avm_write_master_write <= '0';
    --                 counter_check <= '1';
    --             else
    --                 avm_write_master_writedata <= std_logic_vector(
    --                     to_unsigned(counter, READDATA_SIZE));
    --                 counter <= counter + 1;
    --                 write_address <= std_logic_vector(unsigned(
    --                     write_address) + BYTES_PER_WORD);
    --             end if;
    --         end if;
    --     end if;
    -- end process;

    -- avs_csr_waitrequest <= avm_write_master_waitrequest;

end architecture;
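The read state machine above issues a new burst only when the FIFO has room for it, counting both the words already stored and the words that have been requested but not yet returned. The following simulation-only sketch restates that arithmetic with plain naturals so it can be checked in isolation. The entity name, the function names, and the GUARD_WORDS constant are hypothetical and are not part of the design; the depth, burst length, and margin of 5 are taken from the listing. Unlike the unsigned subtraction in the listing, the natural subtraction here raises a range error rather than wrapping if the FIFO is oversubscribed.

```vhdl
-- Hypothetical, self-checking sketch of the FIFO room check (simulation only).
entity room_in_fifo_sketch is
end entity;

architecture sim of room_in_fifo_sketch is
    constant BUFFER_DEPTH : natural := 1024;
    constant BURST_LENGTH : natural := 32;
    constant GUARD_WORDS  : natural := 5;  -- the "+ 5" margin used by the FSM

    -- Free space = depth - words stored - words requested but not yet returned.
    function room_in_fifo(words_used : natural; pending : natural) return natural is
    begin
        return BUFFER_DEPTH - words_used - pending;
    end function;

    -- A burst may be issued only if a whole burst plus the margin still fits.
    function can_issue_burst(words_used : natural; pending : natural) return boolean is
    begin
        return room_in_fifo(words_used, pending) >= BURST_LENGTH + GUARD_WORDS;
    end function;
begin
    check : process
    begin
        assert room_in_fifo(0, 0) = 1024
            report "an empty FIFO should report its full depth" severity failure;
        assert can_issue_burst(900, 64)
            report "60 free words should allow a burst" severity failure;
        assert not can_issue_burst(960, 32)
            report "32 free words should block a burst" severity failure;
        wait;
    end process;
end architecture;
```

Counting pending reads as occupied space is what keeps words that are still in flight from overflowing the FIFO when they arrive.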
------------------------------------------------------------
--
--! @file xcvr_core.vhd
--! @brief Transmission interface
--! @details Contains transceiver phy and seriallite core for
--!          transmission over transceivers
--! @author Monica Whitaker
--! @date August 2016
--! @copyright Copyright (C) 2016 Ross K. Snider and
--!            Monica Whitaker
--
-- This program is free software: you can redistribute it and/or
-- modify it under the terms of the GNU General Public License
-- as published by the Free Software Foundation, either version
-- 3 of the License, or (at your option) any later version.
--
-- This program is distributed in the hope that it will be
-- useful, but WITHOUT ANY WARRANTY; without even the implied
-- warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
-- PURPOSE. See the GNU General Public License for more details.
--
-- You should have received a copy of the GNU General Public
-- License along with this program. If not, see <http://www.gnu.org/licenses/>.
--
-- Monica Whitaker
-- Electrical and Computer Engineering
-- Montana State University
-- 610 Cobleigh Hall
-- Bozeman, MT 59717
-- monica.whitaker@msu.montana.edu
--
------------------------------------------------------------
library IEEE;                 --! Use standard library.
use IEEE.STD_LOGIC_1164.ALL;  --! Use standard logic elements.
------------------------------------------------------------
--
--! @brief xcvr core
--! @details Contains transceiver phy and seriallite core for
--!          transmission over transceivers
--! @param clk_50MHz             50 MHz input clock for phy management
--! @param xcvr_ref_clk          Transceiver pll reference clock
--! @param clkdata               Clock for Atlantic interface
--! @param reset                 Active high reset
--! @param reset_n               Active low reset
--! @param rx_serial_data        Serial receiver interface
--! @param tx_serial_data        Serial transmission interface
--! @param tx_ready              Ready signal for transmission
--! @param rx_ready              Ready signal for receiver
--! @param stat_rr_link          Indicates link is up
--! @param tdat                  Data to transmit
--! @param tdav                  Data available
--! @param tena                  Enable transmission
--! @param tsop                  Transmit start of packet
--! @param teop                  Transmit end of packet
--! @param terr                  Error in transmit data
--! @param tmty                  Number of empty bytes in
--!                              transmit data
--! @param taddr                 Address of packet to send
--! @param rdav                  Data available
--! @param rval                  Data valid
--! @param rdat                  Incoming data
--! @param rsop                  Receiver start of packet signal
--! @param reop                  Receiver end of packet signal
--! @param rerr                  Receive error
--! @param rmty                  Number of empty bytes in
--!                              received data
--! @param raddr                 Address of packet received
--! @param err_crc_lock          CRC error found
--! @param reconfig_reset        Reset for reconfiguration interface
--! @param reconfig_read         Read request
--! @param reconfig_write        Write request
--! @param reconfig_address      Reconfiguration address
--! @param reconfig_writedata    Data to write on
--!                              reconfiguration interface
--! @param reconfig_waitrequest  Waitrequest from
--!                              reconfiguration interface
--! @param reconfig_readdata     Data read from
--!                              reconfiguration interface
------------------------------------------------------------
entity xcvr_core is
    generic (
        NUMBER_OF_LANES : natural := 1;
        LANE_WIDTH      : natural := 32
    );
    port (
        clk_50MHz      : in  std_logic;
        xcvr_ref_clk   : in  std_logic;
        clkdata        : in  std_logic;
        reset          : in  std_logic;
        reset_n        : in  std_logic;
        rx_serial_data : in  std_logic;
        tx_serial_data : out std_logic;

        tx_ready : out std_logic;
        rx_ready : out std_logic;

        stat_rr_link : out std_logic;

        tdat  : in  std_logic_vector(((NUMBER_OF_LANES * LANE_WIDTH)-1) downto 0);
        tdav  : out std_logic;
        tena  : in  std_logic;
        tsop  : in  std_logic;
        teop  : in  std_logic;
        terr  : in  std_logic;
        tmty  : in  std_logic_vector(1 downto 0);
        taddr : in  std_logic_vector(7 downto 0);

        rdat  : out std_logic_vector(((NUMBER_OF_LANES * LANE_WIDTH)-1) downto 0);
        rdav  : out std_logic;
        rval  : out std_logic;
        rena  : in  std_logic;
        rsop  : out std_logic;
        reop  : out std_logic;
        rerr  : out std_logic;
        rmty  : out std_logic_vector(1 downto 0);
        raddr : out std_logic_vector(7 downto 0);

        err_crc_lock : out std_logic;

        reconfig_reset       : in  std_logic;
        reconfig_read        : in  std_logic;
        reconfig_write       : in  std_logic;
        reconfig_address     : in  std_logic_vector(9 downto 0);
        reconfig_writedata   : in  std_logic_vector(31 downto 0);
        reconfig_waitrequest : out std_logic;
        reconfig_readdata    : out std_logic_vector(31 downto 0)
    );
end entity;

architecture arch of xcvr_core is
    ------------------------------------------------------------
    -- Component Definitions
    ------------------------------------------------------------
    component a10_xcvr_phy is
        port (
            rx_analogreset          : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_cal_busy             : out std_logic_vector(0 downto 0);
            rx_cdr_refclk0          : in  std_logic := '0';
            rx_clkout               : out std_logic_vector(0 downto 0);
            rx_coreclkin            : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_datak                : out std_logic_vector(3 downto 0);
            rx_digitalreset         : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_disperr              : out std_logic_vector(3 downto 0);
            rx_errdetect            : out std_logic_vector(3 downto 0);
            rx_is_lockedtodata      : out std_logic_vector(0 downto 0);
            rx_is_lockedtoref       : out std_logic_vector(0 downto 0);
            rx_parallel_data        : out std_logic_vector(31 downto 0);
            rx_patterndetect        : out std_logic_vector(3 downto 0);
            rx_runningdisp          : out std_logic_vector(3 downto 0);
            rx_serial_data          : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_syncstatus           : out std_logic_vector(3 downto 0);
            tx_analogreset          : in  std_logic_vector(0 downto 0) := (others => '0');
            tx_cal_busy             : out std_logic_vector(0 downto 0);
            tx_clkout               : out std_logic_vector(0 downto 0);
            tx_coreclkin            : in  std_logic_vector(0 downto 0) := (others => '0');
            tx_datak                : in  std_logic_vector(3 downto 0) := (others => '0');
            tx_digitalreset         : in  std_logic_vector(0 downto 0) := (others => '0');
            tx_parallel_data        : in  std_logic_vector(31 downto 0) := (others => '0');
            tx_serial_clk0          : in  std_logic_vector(0 downto 0) := (others => '0');
            tx_serial_data          : out std_logic_vector(0 downto 0);
            unused_rx_parallel_data : out std_logic_vector(71 downto 0);
            unused_tx_parallel_data : in  std_logic_vector(91 downto 0) := (others => '0')
        );
    end component a10_xcvr_phy;

    component sl2_core is
        port (
            rx_parallel_data_out : in  std_logic_vector(31 downto 0);
            rx_coreclk           : in  std_logic;
            rx_ctrldetect        : in  std_logic_vector(3 downto 0);
            stat_rr_pattdet      : in  std_logic_vector(3 downto 0);
            err_rr_disp          : in  std_logic_vector(3 downto 0);
            tx_coreclk           : in  std_logic;
            ctrl_tc_force_train  : in  std_logic;
            mreset_n             : in  std_logic;
            rxrdp_clk            : in  std_logic;
            rxrdp_ena            : in  std_logic;
            -- receive FIFO threshold low - units in elements
            ctl_rxrdp_ftl        : in  std_logic_vector(7 downto 0);
            ctl_rxrdp_eopdav     : in  std_logic;
            txrdp_clk            : in  std_logic;
            txrdp_ena            : in  std_logic;
            txrdp_sop            : in  std_logic;
            txrdp_eop            : in  std_logic;
            txrdp_err            : in  std_logic;
            txrdp_mty            : in  std_logic_vector(1 downto 0);
            txrdp_dat            : in  std_logic_vector(31 downto 0);
            txrdp_adr            : in  std_logic_vector(7 downto 0);
            -- transmit FIFO buffer threshold high
            ctl_txrdp_fth        : in  std_logic_vector(7 downto 0);
            flip_polarity        : out std_logic;
            rrefclk              : out std_logic;
            stat_rr_link         : out std_logic;
            err_rr_8berrdet      : in  std_logic_vector(3 downto 0);
            tx_parallel_data_in  : out std_logic_vector(31 downto 0);
            tx_ctrlenable        : out std_logic_vector(3 downto 0);
            tx_coreclock         : out std_logic;
            rxrdp_sop            : out std_logic;
            rxrdp_eop            : out std_logic;
            rxrdp_err            : out std_logic;
            rxrdp_mty            : out std_logic_vector(1 downto 0);
            rxrdp_dat            : out std_logic_vector(31 downto 0);
            rxrdp_adr            : out std_logic_vector(7 downto 0);
            rxrdp_val            : out std_logic;
            rxrdp_dav            : out std_logic;
            -- Atlantic FIFO buffer is empty
            stat_rxrdp_empty     : out std_logic;
            -- Atlantic FIFO buffer overflow and data lost
            err_tc_rxrdp_oflw    : out std_logic;
            -- Atlantic FIFO buffer overflow and data lost
            err_txrdp_oflw       : out std_logic;
            txrdp_dav            : out std_logic;
            -- frequency offset tolerance FIFO buffer overflow
            -- link restarts
            err_rr_foffre_oflw   : out std_logic;
            -- frequency offset tolerance FIFO buffer underflow
            stat_tc_foffre_empty : out std_logic;
            -- end of bad packet character received
            stat_rr_ebprx        : out std_logic;
            -- BIP-8 error detected in link management packet
            err_rr_bip8          : out std_logic;
            -- CRC error detected
            err_rr_crc           : out std_logic;
            err_rr_fcrx_bne      : out std_logic;
            err_rr_roerx_bne     : out std_logic;
            -- invalid link management packet received
            err_rr_invalid_lmprx : out std_logic;
            -- start of data control word missing
            err_rr_missing_start_dcw : out std_logic;
            -- start and end address fields do not match
            err_rr_addr_mismatch : out std_logic;
            -- possible catastrophic error
            err_rr_pol_rev_required : out std_logic
        );
    end component;

    component dual_clock_fifo is
        generic (
            enable_ecc             : string := "FALSE";
            intended_device_family : string := "Arria 10";
            lpm_hint               : string := "DISABLE_DCFIFO_EMBEDDED_TIMING_CONSTRAINT=TRUE";
            lpm_numwords           : natural;
            lpm_showahead          : string := "OFF";
            lpm_type               : string := "dcfifo";
            lpm_width              : natural;
            lpm_widthu             : natural;
            overflow_checking      : string := "ON";
            rdsync_delaypipe       : natural;
            underflow_checking     : string := "ON";
            use_eab                : string := "ON";
            wrsync_delaypipe       : natural
        );
        port (
            data      : in  std_logic_vector(lpm_width - 1 downto 0) := (others => 'X');
            wrreq     : in  std_logic := 'X';
            rdreq     : in  std_logic := 'X';
            wrclk     : in  std_logic := 'X';
            rdclk     : in  std_logic := 'X';
            aclr      : in  std_logic := '0';
            q         : out std_logic_vector(lpm_width - 1 downto 0);
            rdempty   : out std_logic;
            wrfull    : out std_logic;
            rdfull    : out std_logic;
            wrempty   : out std_logic;
            rdusedw   : out std_logic_vector(lpm_widthu - 1 downto 0);
            wrusedw   : out std_logic_vector(lpm_widthu - 1 downto 0);
            eccstatus : out std_logic_vector(1 downto 0)
        );
    end component;

    component xcvr_pll is
        port (
            pll_cal_busy  : out std_logic;
            pll_locked    : out std_logic;
            pll_powerdown : in  std_logic := '0';
            pll_refclk0   : in  std_logic := '0';
            tx_serial_clk : out std_logic
        );
    end component;

    component xcvr_reset is
        port (
            clock              : in  std_logic := '0';
            pll_locked         : in  std_logic_vector(0 downto 0) := (others => '0');
            pll_powerdown      : out std_logic_vector(0 downto 0);
            pll_select         : in  std_logic_vector(0 downto 0) := (others => '0');
            reset              : in  std_logic := '0';
            rx_analogreset     : out std_logic_vector(0 downto 0);
            rx_cal_busy        : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_digitalreset    : out std_logic_vector(0 downto 0);
            rx_is_lockedtodata : in  std_logic_vector(0 downto 0) := (others => '0');
            rx_ready           : out std_logic_vector(0 downto 0);
            tx_analogreset     : out std_logic_vector(0 downto 0);
            tx_cal_busy        : in  std_logic_vector(0 downto 0) := (others => '0');
            tx_digitalreset    : out std_logic_vector(0 downto 0);
            tx_ready           : out std_logic_vector(0 downto 0)
        );
    end component;

    ------------------------------------------------------------
    -- Signal Definitions
    ------------------------------------------------------------
    signal ONES : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

    signal rx_freqlocked : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

    signal ctl_rxrdp_ftl     : std_logic_vector(7 downto 0);
    signal ctl_txrdp_fth     : std_logic_vector(7 downto 0);
    signal stat_rr_link_min2 : std_logic;
    signal stat_rr_link_min1 : std_logic;

    signal stat_rxrdp_empty         : std_logic;
    signal err_tc_rxrdp_oflw        : std_logic;
    signal err_txrdp_oflw           : std_logic;
    signal err_rr_foffre_oflw       : std_logic;
    signal stat_tc_foffre_empty     : std_logic;
    signal stat_rr_ebprx            : std_logic;
    signal err_rr_bip8              : std_logic;
    signal err_rr_fcrx_bne          : std_logic;
    signal err_rr_roerx_bne         : std_logic;
    signal err_rr_invalid_lmprx     : std_logic;
    signal err_rr_missing_start_dcw : std_logic;
    signal err_rr_addr_mismatch     : std_logic;
    signal err_rr_crc               : std_logic;

    signal rx_parallel_data : std_logic_vector((NUMBER_OF_LANES * LANE_WIDTH)-1 downto 0);
    signal tx_parallel_data : std_logic_vector((NUMBER_OF_LANES * LANE_WIDTH)-1 downto 0);
    signal tx_datak         : std_logic_vector(3 downto 0);
    signal rx_datak         : std_logic_vector(3 downto 0);

    signal rx_coreclk   : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
    signal tx_coreclk   : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
    signal rx_clkout    : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
    signal tx_clkout    : std_logic_vector(NUMBER_OF_LANES - 1 downto 0);
    signal tx_coreclock : std_logic;
    signal rrefclk      : std_logic;

    signal rx_disperr       : std_logic_vector(3 downto 0);
    signal rx_errdetect     : std_logic_vector(3 downto 0);
    signal rx_patterndetect : std_logic_vector(3 downto 0);

    signal tx_cal_busy_combined : std_logic_vector(0 downto 0);
    signal tx_serial_clk_pll    : std_logic;
    signal pll_powerdown        : std_logic;
    signal pll_cal_busy         : std_logic;
    signal pll_locked           : std_logic;
    signal tx_serial_clk        : std_logic_vector(NUMBER_OF_LANES-1 downto 0);

    signal tx_cal_busy       : std_logic_vector(0 downto 0);
    signal tx_ready_i        : std_logic_vector(0 downto 0);
    signal rx_cal_busy       : std_logic_vector(0 downto 0);
    signal rx_ready_i        : std_logic_vector(0 downto 0);
    signal rx_analogreset_i  : std_logic_vector(0 downto 0);
    signal rx_digitalreset_i : std_logic_vector(0 downto 0);
    signal tx_analogreset_i  : std_logic_vector(0 downto 0);
    signal tx_digitalreset_i : std_logic_vector(0 downto 0);


    signal w_req                  : std_logic;
    signal r_req                  : std_logic;
    signal w_full                 : std_logic;
    signal r_empty                : std_logic;
    signal err_8b_lock            : std_logic;
    signal err_addr_mismatch_lock : std_logic;
    signal err_bip8_lock          : std_logic;
    signal err_invalid_lmprx_lock : std_logic;
    signal err_missing_lock       : std_logic;
    signal err_array              : std_logic_vector(4 downto 0);

begin

    generate_ALTGX_clocks :
    for i in 0 to NUMBER_OF_LANES-1 generate
        rx_coreclk(i)           <= rx_clkout(0);
        tx_coreclk(i)           <= tx_clkout(0);
        tx_cal_busy_combined(i) <= tx_cal_busy(i) or pll_cal_busy;
    end generate;

    generate_xcvr_serial_clocks_1 :
    for i in 0 to NUMBER_OF_LANES-1 generate
        tx_serial_clk(i) <= tx_serial_clk_pll;
    end generate;

    u0 : component a10_xcvr_phy
        port map(
            rx_analogreset          => rx_analogreset_i,
            rx_cal_busy             => rx_cal_busy,
            rx_cdr_refclk0          => xcvr_ref_clk,
            rx_clkout               => rx_clkout,
            rx_coreclkin            => rx_coreclk,
            rx_datak                => rx_datak,
            rx_digitalreset         => rx_digitalreset_i,
            rx_disperr              => rx_disperr,
            rx_errdetect            => rx_errdetect,
            rx_is_lockedtodata      => rx_freqlocked,
            rx_is_lockedtoref       => open,
            rx_parallel_data        => rx_parallel_data,
            rx_runningdisp          => open,
            rx_patterndetect        => rx_patterndetect,
            rx_serial_data(0)       => rx_serial_data,
            rx_syncstatus           => open,
            tx_analogreset          => tx_analogreset_i,
            tx_cal_busy             => tx_cal_busy,
            tx_clkout               => tx_clkout,
            tx_coreclkin            => tx_coreclk,
            tx_datak                => tx_datak,
            tx_digitalreset         => tx_digitalreset_i,
            tx_parallel_data        => tx_parallel_data,
            tx_serial_clk0          => tx_serial_clk,
            tx_serial_data(0)       => tx_serial_data,
            unused_rx_parallel_data => open,
            unused_tx_parallel_data => (others => '0')
        );

    u1 : sl2_core
        port map(
            rx_parallel_data_out     => rx_parallel_data,
            rx_coreclk               => rx_coreclk(0),
            rx_ctrldetect            => rx_datak,
            stat_rr_pattdet          => rx_patterndetect,
            err_rr_disp              => rx_disperr,
            tx_coreclk               => tx_coreclk(0),
            ctrl_tc_force_train      => '0',
            mreset_n                 => reset_n,
            rxrdp_clk                => clkdata,
            rxrdp_ena                => rena,
            ctl_rxrdp_ftl            => ctl_rxrdp_ftl,
            ctl_rxrdp_eopdav         => '0',
            txrdp_clk                => clkdata,
            txrdp_ena                => tena,
            txrdp_sop                => tsop,
            txrdp_eop                => teop,
            txrdp_err                => terr,
            txrdp_mty                => tmty,
            txrdp_dat                => tdat,
            txrdp_adr                => taddr,
            ctl_txrdp_fth            => ctl_txrdp_fth,
            flip_polarity            => open,
            rrefclk                  => rrefclk,
            stat_rr_link             => stat_rr_link_min2,
            err_rr_8berrdet          => rx_errdetect,
            tx_parallel_data_in      => tx_parallel_data,
            tx_ctrlenable            => tx_datak,
            tx_coreclock             => tx_coreclock,
            rxrdp_sop                => rsop,
            rxrdp_eop                => reop,
            rxrdp_err                => rerr,
            rxrdp_mty                => rmty,
            rxrdp_dat                => rdat,
            rxrdp_adr                => raddr,
            rxrdp_val                => rval,
            rxrdp_dav                => rdav,
            stat_rxrdp_empty         => stat_rxrdp_empty,
            err_tc_rxrdp_oflw        => err_tc_rxrdp_oflw,
            err_txrdp_oflw           => err_txrdp_oflw,
            txrdp_dav                => tdav,
            err_rr_foffre_oflw       => err_rr_foffre_oflw,
            stat_tc_foffre_empty     => stat_tc_foffre_empty,
            stat_rr_ebprx            => stat_rr_ebprx,
            err_rr_bip8              => err_rr_bip8,
            err_rr_crc               => err_rr_crc,
            err_rr_fcrx_bne          => err_rr_fcrx_bne,
            err_rr_roerx_bne         => err_rr_roerx_bne,
            err_rr_invalid_lmprx     => err_rr_invalid_lmprx,
            err_rr_missing_start_dcw => err_rr_missing_start_dcw,
            err_rr_addr_mismatch     => err_rr_addr_mismatch,
            err_rr_pol_rev_required  => open
        );

    u2 : xcvr_pll
        port map(
            pll_cal_busy  => pll_cal_busy,
            pll_locked    => pll_locked,
            pll_powerdown => pll_powerdown,
            pll_refclk0   => xcvr_ref_clk,
            tx_serial_clk => tx_serial_clk_pll
        );

480 u3 : x c v r r e s e t
481 port map(
482 c l o ck => clk 50MHz ,
483 p l l l o c k e d (0 ) => p l l l o c k ed ,
484 pll powerdown (0) => pll powerdown ,
485 p l l s e l e c t => ( others => ’ 0 ’ ) ,
486 r e s e t => r e s e t ,
487 r x ana l o g r e s e t => r x an a l o g r e s e t i ,
488 r x ca l bu sy => rx ca l busy ,
489 r x d i g i t a l r e s e t => r x d i g i t a l r e s e t i ,
490 r x i s l o c k e d t od a t a => r x f r eq l o ck ed ,
491 rx ready => r x r eady i ,
492 t x ana l o g r e s e t => t x an a l o g r e s e t i ,
493 t x ca l bu sy => tx ca l busy combined ,
494 t x d i g i t a l r e s e t => t x d i g i t a l r e s e t i ,
495 tx ready => t x r e ady i
496 ) ;
497
498 f i f o l o c k : d u a l c l o c k f i f o
499 generic map(
500 lpm numwords => 32 ,
139
501 lpm width => 5 ,
502 lpm widthu => 5 ,
503 rd sync de layp ipe => 3 ,
504 wrsync de layp ipe => 3
505 )
506 port map(
507 data => e r r r r b i p 8 & e r r r r c r c &
e r r r r i n v a l i d lmp r x &
508 e r r r r m i s s i n g s t a r t d cw &
err r r addr mismatch ,
509 wrreq => w req ,
510 rdreq => r r eq ,
511 wrclk => r r e f c l k ,
512 rdc lk => c lkdata ,
513 a c l r => ’ 0 ’ ,
514 q => e r r a r ray ,
515 rdempty => r empty ,
516 wr f u l l => w fu l l ,
517 r d f u l l => open ,
518 wrempty => open
519 ) ;
520 −−Ava i l a b l e f o r f u t u r e cons i d e ra t i on
521 e r r b i p 8 l o c k <= er r a r r a y (4 ) ;
522 e r r c r c l o c k <= er r a r r a y (3 ) ;
523 e r r i n v a l i d lmp r x l o c k <= er r a r r a y (2 ) ;
524 e r r m i s s i n g l o c k <= er r a r r a y (1 ) ;
525 e r r addr mismatch lock <= er r a r r a y (0 ) ;

    process(rrefclk)
    begin
        if (rising_edge(rrefclk)) then
            if (w_full = '0') then
                w_req <= '1';
            else
                w_req <= '0';
            end if;
        end if;
    end process;

    process(clkdata)
    begin
        if (rising_edge(clkdata)) then
            if (r_empty = '0') then
                r_req <= '1';
            else
                r_req <= '0';
            end if;
        end if;
    end process;


    -------------------------------------------------------------
    -- Generate Zeroes and Ones
    -------------------------------------------------------------
    generate_ZEROES_and_ONES :
    for i in 0 to NUMBER_OF_LANES-1 generate
        ONES(I) <= '1';
    end generate;

    -------------------------------------------------------------
    -- Generate tx_ready and rx_ready
    -------------------------------------------------------------
    tx_ready <= '1' when tx_ready_i = ONES else '0';
    rx_ready <= '1' when rx_ready_i = ONES else '0';

    ctl_rxrdp_ftl <= "00010010"; -- Set arbitrarily (check simulation)
    ctl_txrdp_fth <= "01110000"; -- Set arbitrarily (check simulation)

    --register for link status
    process(clk_50MHz, reset)
    begin
        if (reset = '1') then
            stat_rr_link_min1 <= '0';
            stat_rr_link <= '0';
        elsif (rising_edge(clk_50MHz)) then
            stat_rr_link_min1 <= stat_rr_link_min2;
            stat_rr_link <= stat_rr_link_min1;
        end if;
    end process;
end architecture;
APPENDIX C
MATLAB CODE
% ------------------------------------------------------------
% sort_tb.m
% Testbench for sorting component -- sort in two cycles
% ------------------------------------------------------------
clc;

sort_hdl = hdlcosim_top;
NUM_TRIALS = 50;

for k = 1:NUM_TRIALS
    %Build all the random inputs
    in1 = single(randn());
    in2 = single(randn());
    in3 = single(randn());
    in4 = single(randn());
    in5 = single(randn());
    in6 = single(randn());
    in7 = single(randn());
    in8 = single(randn());
    in9 = single(randn());
    in10 = single(randn());
    in11 = single(randn());
    in12 = single(randn());
    in13 = single(randn());
    in14 = single(randn());
    in15 = single(randn());
    in16 = single(randn());
    in17 = single(randn());
    in18 = single(randn());
    in19 = single(randn());
    in20 = single(randn());

    input_history{k} = [in20 in19 in18 in17 in16 in15 in14 in13 ...
        in12 in11 in10 in9 in8 in7 in6 in5 in4 in3 in2 in1];

    %input into system
    [out20 out19 out18 out17 out16 out15 out14 out13 out12 out11 ...
     out10 out9 out8 out7 out6 out5 out4 out3 out2 out1 ...
     ind1 ind2 ind3 ind4 ind5 ind6 ind7 ind8 ind9 ind10 ind11 ...
     ind12 ind13 ind14 ind15 ind16 ind17 ind18 ind19 ind20] = ...
        step(sort_hdl, in1, in2, in3, in4, in5, in6, in7, in8, in9, in10, ...
             in11, in12, in13, in14, in15, in16, in17, in18, in19, in20);

    output_history{k} = [out20 out19 out18 out17 out16 out15 ...
        out14 out13 out12 out11 out10 out9 out8 out7 out6 out5 ...
        out4 out3 out2 out1];
    output_indices{k} = [ind1 ind2 ind3 ind4 ind5 ind6 ind7 ind8 ...
        ind9 ind10 ind11 ind12 ind13 ind14 ind15 ind16 ind17 ind18 ...
        ind19 ind20];
end;

latency = 2;
for k = 1:NUM_TRIALS-latency
    original = input_history{k}
    % sorted = output_history{k+latency}
    sorted(k,:) = output_history{k+latency}
    temp = output_indices{k+latency};
    % indices = double(temp)
    indices(k,:) = double(temp);

    %compute sort in MATLAB
    [actual(k,:), actual_index(k,:)] = sort(original);

    val_diff = actual-sorted;

    ind_diff = actual_index-indices;
end;

T = table(sorted, actual);
writetable(T, 'sorted.xlsx', 'Range', 'B1');
T = table(indices, actual_index);
writetable(T, 'indices.xlsx', 'Range', 'B1');
T = table(val_diff, ind_diff);
writetable(T, 'errors.xlsx', 'Range', 'B1');
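The comparison in sort_tb.m depends on pairing each input vector with the output that emerges `latency` calls later. The following is a minimal sketch of that alignment only, under the assumption that the pipeline delay is a fixed number of steps. The HDL cosimulation object is replaced by a plain delay buffer, and `inputs`, `outputs`, and `max_diff` are illustrative names that do not appear in the thesis code.

```matlab
latency    = 2;     % assumed pipeline depth in steps, as in sort_tb.m
num_trials = 10;    % illustrative sizes
num_inputs = 20;

inputs  = single(randn(num_trials, num_inputs));
outputs = zeros(num_trials, num_inputs, 'single');

% Stand-in for the cosimulated sorter: the sorted version of row k
% becomes visible at row k + latency.
for k = 1:num_trials - latency
    outputs(k + latency, :) = sort(inputs(k, :));
end

% Pair every input row with the output produced latency steps later
% and track the largest mismatch against MATLAB's own sort.
max_diff = single(0);
for k = 1:num_trials - latency
    expected = sort(inputs(k, :));
    observed = outputs(k + latency, :);
    max_diff = max(max_diff, max(abs(expected - observed)));
end
fprintf('largest value mismatch: %g\n', max_diff);
```

With a real cosimulation object, the stand-in loop would be replaced by the `step` calls, and a nonzero `max_diff` would indicate either a sorting error or a wrong latency value.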
% ------------------------------------------------------------
% camera_ratios_tb.m
% Testbench for verification of correct ratio calculations
% ------------------------------------------------------------
clc;

sort_hdl = hdlcosim_camera_relations;
lines = 10;%648;
hyper_lines = 160;
linescan_pixels = 8000; %1536;
hyper_pixels = 1024; %200;
pixel_ratio = linescan_pixels/hyper_pixels; %7.8125
line_ratio = lines/hyper_lines; %4.05

offset = fi(32,1,13,0);
pix_ratio = fi((1/pixel_ratio),0,32,32);
line_rat = fi((1/line_ratio),0,32,32);
step(sort_hdl, offset, pix_ratio, line_rat, fi(0,0,32,0), fi(0,0,13,0), ...
     fi(0,0,13,0));

for k = 1:lines
    for j = 0:linescan_pixels-1
        line = fi(k,0,32,0);
        start = fi(j,0,13,0);
        end_pix = fi(j,0,13,0);

        input_history{j+1} = [line start end_pix];
        [reg_line, reg_start, reg_endpix ignore] = step(sort_hdl, ...
            offset, pix_ratio, line_rat, line, start, end_pix);
        output_history{j+1} = [reg_line reg_start reg_endpix];
        output_flag{j+1} = ignore;
    end;
end;

errors = 0;
sim = zeros(linescan_pixels,2);
actual = zeros(linescan_pixels,2);
latency = 1;
for k = 1:lines
    for j = 0:linescan_pixels-1-latency
        original = input_history{j+1};
        computed = output_history{j+1+latency};
        inp = fi(original(2),0,13,0);
        comp = fi(computed(2),0,10,0);

        ignore_flag(j+1) = output_flag{j+1+latency};

        sim(j+1,:) = [inp comp];
        line_pix = fi(j,0,13,0);
        act = floor((line_pix+offset) * pix_ratio);
        actual(j+1,:) = [line_pix act];

        if act ~= comp
            errors = errors + 1;
        end;
    end;
end;

plot(sim(:,1)', sim(:,2)', 'r');
hold on;
plot(actual(:,1)', actual(:,2)', '*');

save test
clear
load test
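The reference value in camera_ratios_tb.m is computed as floor((line_pix + offset) * pix_ratio), where pix_ratio is the reciprocal of the pixel ratio stored with 32 fractional bits. The sketch below reproduces that mapping with ordinary doubles so that the effect of quantizing the reciprocal can be inspected without Fixed-Point Designer. Modeling `fi(x,0,32,32)` as round-to-nearest on a 2^-32 grid is an assumption, and the names `pix_ratio_q`, `mapped_q`, and `mapped_exact` are illustrative.

```matlab
linescan_pixels = 8000;     % values taken from camera_ratios_tb.m
hyper_pixels    = 1024;
offset          = 32;
pixel_ratio     = linescan_pixels / hyper_pixels;   % 7.8125

% Assumed model of fi(1/pixel_ratio, 0, 32, 32): round the reciprocal
% to the nearest multiple of 2^-32.
pix_ratio_q = round((1 / pixel_ratio) * 2^32) / 2^32;

line_pix = 0:linescan_pixels - 1;

% Multiply-by-reciprocal mapping (as in the testbench) versus a direct
% division used here as the reference.
mapped_q     = floor((line_pix + offset) * pix_ratio_q);
mapped_exact = floor((line_pix + offset) / pixel_ratio);

mismatches = sum(mapped_q ~= mapped_exact);
fprintf('pixels that map differently: %d of %d\n', mismatches, linescan_pixels);
```

Replacing the division by a multiplication with a stored reciprocal avoids a divider in hardware; the mismatch count indicates whether the chosen fraction width is sufficient for a given pair of sensor widths.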
% ------------------------------------------------------------
% objects_tb.m
% Testbench for classification of objects. Utilizes two objects.
% ------------------------------------------------------------
if (~exist('class_data','var'))
    [Img, start_pix, end_pix] = edge_builder('objects_simple.png');
    for k = 1:50
        data(:,:,k) = xlsread('results_luckycharm.xlsx', k+20, 'B2:F65');
    end;
    class_data = single(data);
end;

clc;
classes = 5;
lines = 192*10;%648;
hyper_lines = 10;
linescan_pixels = 1536;
hyper_pixels = 64;
pixel_ratio = linescan_pixels/hyper_pixels; %24
line_ratio = lines/hyper_lines; %192
num_objects = 2;

objects_hdl = hdlcosim_top; % Set up simulation object

object = fi(0,0,54,0);
offset = fi(0,1,13,0);
pix_ratio = fi((1/pixel_ratio),0,32,32);
line_rat = fi((1/line_ratio),0,32,32);

frame_flag = fi(1,0,1,0);
pixel_num = fi(0,0,10,0);
current_pixel = 0;
data_tracker = 1;
new = 0;
in1 = fi(0,0,32,0);
in2 = fi(0,0,32,0);
in3 = fi(0,0,32,0);
in4 = fi(0,0,32,0);
in5 = fi(0,0,32,0);
current_obj_line = 345;%271;
for K = 0:hyper_lines %10
    for M = 0:hyper_pixels %64
        if (current_pixel == 64)
            data_tracker = data_tracker + 1;
            current_pixel = 0;
            current_obj_line = current_obj_line + 1;
        end;
        pixel_num = fi(current_pixel,0,8,0);
        in1.hex = num2hex(class_data(current_pixel+1,1,data_tracker));
        in2.hex = num2hex(class_data(current_pixel+1,2,data_tracker));
        in3.hex = num2hex(class_data(current_pixel+1,3,data_tracker));
        in4.hex = num2hex(class_data(current_pixel+1,4,data_tracker));
        in5.hex = num2hex(class_data(current_pixel+1,5,data_tracker));
        current_pixel = current_pixel + 1;

        new_results = fi(1,0,1,0);
        for J = 1:(line_ratio/hyper_pixels)%5 lines
            for X = 1:num_objects
                if J ~= 1 || X ~= 1
                    new_results = fi(0,0,1,0);
                end;
                object = bitconcat(fi(K,0,32,0), fi(X,0,6,0), ...
                    fi(floor(pix_ratio*start_pix(current_obj_line,X)),0,8,0), ...
                    fi(floor(pix_ratio*end_pix(current_obj_line,X)),0,8,0));
                % Run data into system
                [out1, out2, out3, out4, out5, objectnum] = step( ...
                    objects_hdl, object, new_results, pixel_num, in1, ...
                    in2, in3, in4, in5);
            end;
        end;
    end;
end;

save test.mat
clear;
load test.mat
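In objects_tb.m each single-precision class value is handed to the HDL model by writing its hexadecimal bit pattern into a 32-bit unsigned fixed-point word (`in1.hex = num2hex(...)`), so the hardware receives the IEEE 754 bits unchanged rather than a converted integer. The sketch below illustrates the same reinterpretation using `typecast`, which does not require fixed-point objects. The variable names are illustrative and not part of the thesis code.

```matlab
value = single(1.5);                  % example class value

% View the same 32 bits as an unsigned integer; no numeric conversion
% takes place, only the type label changes.
bits  = typecast(value, 'uint32');
hex_s = dec2hex(bits, 8);             % 8 hexadecimal digits

% Reinterpreting the integer as a single recovers the original value.
restored = typecast(bits, 'single');

fprintf('%g -> 0x%s -> %g\n', value, hex_s, restored);
```

A plain cast such as `uint32(value)` would instead round the number to an integer, which is not what the floating-point cores in the design expect.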
% ------------------------------------------------------------
% normalize_tb.m
% Testbench for normalize component
% ------------------------------------------------------------
% Start initialization
if (~exist('data', 'var'))
    load('data.mat');
end;

clc;
iterations = 1;
rows = 64;
columns = 64;
normalize_hdl = hdlcosim_top; % Set up simulation object

for K = 1:iterations
    for J = 0:columns - 1
        for I = 0:rows - 1
            datain = data(I+1, J+1);
            darkin = dark(I+1, J+1);
            lightin = lightI(I+1, J+1);
            meanin = means(1, J+1);
            stddevin = stddevI(1, J+1);
            input_history{I+1,J+1} = [datain, darkin, lightin, ...
                meanin, stddevin];
            % Run data into system
            [normalized] = step(normalize_hdl, datain, darkin, ...
                lightin, meanin, stddevin);
            output_history{I+1,J+1} = [normalized];
        end;
    end;
end;

% latency = 4 (subtraction) + 1 (comparison) + 3 (mult) + 1 (comparison) +
%           4 (subtraction) + 3 (mult)
latency = 16;
for I = 1:rows+columns-latency
    inputs = input_history{I}
    normalized(I) = output_history{I+latency}

    actual(I) = normalize(inputs(1), inputs(2), inputs(3), inputs(4), ...
        inputs(5))
end;

% Output results to file
T = table(normalized', actual');
writetable(T, 'normalize.xlsx', 'Range', 'B2', ...
    'WriteVariableNames', false);
% ------------------------------------------------------------
% inner_product_tb.m
% Testbench for inner product component
% ------------------------------------------------------------
% Start initialization
if (~exist('data', 'var'))
    load('data.mat');
end;

clc;
iterations = 1;
rows = 200;
columns = 1;
product_hdl = hdlcosim_top; % Set up simulation object
M = 0;
for K = 1:iterations
    for J = 0:columns - 1
        for I = 0:4:(rows) - 1
            norm1 = normalize(data(I+1,J+1), dark(I+1,J+1), ...
                lightI(I+1,J+1), means(1,I+1), stddevI(1,I+1));
            norm2 = normalize(data(I+2,J+1), dark(I+2,J+1), ...
                lightI(I+2,J+1), means(1,I+2), stddevI(1,I+2));
            norm3 = normalize(data(I+3,J+1), dark(I+3,J+1), ...
                lightI(I+3,J+1), means(1,I+3), stddevI(1,I+3));
            norm4 = normalize(data(I+4,J+1), dark(I+4,J+1), ...
                lightI(I+4,J+1), means(1,I+4), stddevI(1,I+4));
            class1 = class(1,I+1);
            class2 = class(1,I+2);
            class3 = class(1,I+3);
            class4 = class(1,I+4);
            input_history{M+1,J+1} = [norm1, norm2, norm3, norm4, ...
                class1, class2, class3, class4];
            % Run data into system
            [partial1, partial2, partial3, partial4, sum_out] = step( ...
                product_hdl, norm1, norm2, norm3, norm4, class1, ...
                class2, class3, class4);
            output_history{M+1,J+1} = [partial1, partial2, partial3, ...
                partial4, sum_out];
            M = M+1;
        end;
    end;
end;

% latency = 5 (inner product)
% latency = 21 (channel sum)
previous1 = single(0);
previous2 = single(0);
previous3 = single(0);
previous4 = single(0);
latency = 14; %26;
for J = 0:columns-1
    for I = 0:(rows/4-1)-latency
        K = 4*I;
        inputs = input_history{I+1,J+1};
        sim = output_history{I+1+latency,J+1}

        norms = [normalize(data(K+1,J+1), dark(K+1,J+1), lightI(K+1,J+1), means(1,K+1), stddevI(1,K+1)) ...
                 normalize(data(K+2,J+1), dark(K+2,J+1), lightI(K+2,J+1), means(1,K+2), stddevI(1,K+2)) ...
                 normalize(data(K+3,J+1), dark(K+3,J+1), lightI(K+3,J+1), means(1,K+3), stddevI(1,K+3)) ...
                 normalize(data(K+4,J+1), dark(K+4,J+1), lightI(K+4,J+1), means(1,K+4), stddevI(1,K+4))];
        actual_sum1 = inner_product(inputs(1), inputs(5), previous1);
        actual_sum2 = inner_product(inputs(2), inputs(6), previous2);
        actual_sum3 = inner_product(inputs(3), inputs(7), previous3);
        actual_sum4 = inner_product(inputs(4), inputs(8), previous4);
        previous1 = actual_sum1;
        previous2 = actual_sum2;
        previous3 = actual_sum3;
        previous4 = actual_sum4;

        total_sum = actual_sum1 + actual_sum2 + actual_sum3 + ...
            actual_sum4
    end;
end;
% ------------------------------------------------------------
% regression_tb.m
% Testbench for regression system
% ------------------------------------------------------------
% Start initialization
if (~exist('datafi', 'var'))
    load('test_data.mat');
end;

clc;
lines = 85;
bands = 160;
samples = 110; %1024;
classes = 5; %20;

regression_hdl = hdlcosim_top; % Set up simulation object

in = fi(0,0,98,0);

%write intercepts
for K = 1:classes
    address = bitconcat(fi(1,0,14,0), fi(K,0,10,0), fi(0,0,8,0));
    data = fi(0,0,32,0);
    data.hex = num2hex(class(1,K));
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
end;

step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));

for K = 1:classes
    for J = 1:(bands/5)*8
        % Address generation
        address = bitconcat(fi(1,0,14,0), fi(K,0,10,0), fi(J,0,8,0));
        data = fi(0,0,32,0);
        data.hex = num2hex(class(J+1,K));

        % Write classes
        step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
            fi(0,0,1,0), fi(1,0,1,0), address, data);
    end;
end;

%Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));

%WRITE MEANS
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(1,0,22,0), fi(J-1,0,10,0));
    data = fi(0,0,32,0);
    data.hex = num2hex(means(1,J));

    % Write means
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
end;

% Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));

%WRITE STDDEVI
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(4,0,22,0), fi(J-1,0,10,0));
    data = fi(0,0,32,0);
    data.hex = num2hex(stddevI(1,J));

    % Write stddevI
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), address, data);
end;

% Empty clock cycles
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(0,0,1,0), fi(0,0,32,0), fi(0,0,32,0));

%READ CLASSES
for K = 1:classes
    for J = 1:(bands/5)*8+1
        % Address generation
        address = bitconcat(fi(1,0,14,0), fi(K,0,10,0), fi(J-1,0,8,0));
        % Read classes
        step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
            fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
        step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
            fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
        step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
            fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
        [~, class_read(J,K)] = step(regression_hdl, in, in, in, ...
            in, in, fi(0,0,1,0), fi(1,0,1,0), fi(0,0,1,0), address, ...
            fi(0,0,32,0));
    end;
end;
T = table(class_read, class);
writetable(T, 'classes.xlsx', 'Range', 'B1');

%READ MEANS
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(1,0,22,0), fi(J-1,0,10,0));
    % Read means
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    [~, mean_read(J,1)] = step(regression_hdl, in, in, in, in, in, ...
        fi(0,0,1,0), fi(1,0,1,0), fi(0,0,1,0), address, ...
        fi(0,0,32,0));
end;
T = table(mean_read, means');
writetable(T, 'means.xlsx', 'Range', 'B1');

%READ STDDEVI
for J = 1:(bands/5)*8
    % Address generation
    address = bitconcat(fi(4,0,22,0), fi(J-1,0,10,0));
    % Read stddevI
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
    [~, stddev_read(J,1)] = step(regression_hdl, in, in, in, in, ...
        in, fi(0,0,1,0), fi(1,0,1,0), fi(0,0,1,0), address, ...
        fi(0,0,32,0));
end;
T = table(stddev_read, stddevI');
writetable(T, 'stddevs.xlsx', 'Range', 'B1');
%Set Enable
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(1,0,1,0), fi(0,0,32,0), fi(1,0,32,0));
% Set Interrupt Enable
step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), fi(0,0,1,0), ...
    fi(1,0,1,0), fi(1,0,32,0), fi(1,0,32,0));
% End initialization

for K = 1:lines
    sum = single(zeros(samples, classes));

    for J = 0:samples - 1
        for I = 0:5:bands - 1
            % Start test data generation
            in1 = bitconcat(fi(I,0,8,0), fi(J,0,10,0), datafi(K,I+1,J+1), ...
                lightIfi(I+1,J+1), darkfi(I+1,J+1));
            in2 = bitconcat(fi(I+1,0,8,0), fi(J,0,10,0), datafi(K,I+2,J+1), ...
                lightIfi(I+2,J+1), darkfi(I+2,J+1));
            in3 = bitconcat(fi(I+2,0,8,0), fi(J,0,10,0), datafi(K,I+3,J+1), ...
                lightIfi(I+3,J+1), darkfi(I+3,J+1));
            in4 = bitconcat(fi(I+3,0,8,0), fi(J,0,10,0), datafi(K,I+4,J+1), ...
                lightIfi(I+4,J+1), darkfi(I+4,J+1));
            in5 = bitconcat(fi(I+4,0,8,0), fi(J,0,10,0), datafi(K,I+5,J+1), ...
                lightIfi(I+5,J+1), darkfi(I+5,J+1));
            % End test data generation

            % Run data into system
            step(regression_hdl, in1, in2, in3, in4, in5, ...
                fi(1,0,1,0), fi(0,0,1,0), fi(0,0,1,0), fi(0,0,32,0), ...
                fi(0,0,32,0));
        end;
    end;
    % Wait for interrupt
    irq = fi(0,0,1,0);
    while (irq.data ~= 1)
        [irq, ~] = step(regression_hdl, in, in, in, in, in, ...
            fi(0,0,1,0), fi(0,0,1,0), fi(0,0,1,0), fi(0,0,32,0), ...
            fi(0,0,32,0));
    end;

    for M = 1:classes
        for J = 0:samples - 1
            % Address generation
            address = bitconcat(fi(1,0,13,0), fi(M,0,6,0), fi(J,0,13,0));

            % Read results
            step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
                fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
            step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
                fi(1,0,1,0), fi(0,0,1,0), address, fi(0,0,32,0));
            [~, sum(J+1,M)] = step(regression_hdl, in, in, in, in, ...
                in, fi(0,0,1,0), fi(1,0,1,0), fi(0,0,1,0), ...
                address, fi(0,0,32,0));
        end;
    end;
    % Clear Interrupt
    step(regression_hdl, in, in, in, in, in, fi(0,0,1,0), ...
        fi(0,0,1,0), fi(1,0,1,0), fi(2,0,32,0), fi(1,0,32,0));
    % Write results and expected to file
    T = table(sum);
    writetable(T, 'results_lc.xlsx', 'Sheet', K, 'Range', 'B1');
    [model, exact] = calculation_test(datafi(K,1:bands,1:samples), ...
        dark, lightI, means_test, stddevI_test, ...
        class_test(:,1:classes), K);

end;

save test_data
clear;
load test_data
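The register addresses in regression_tb.m are built with `bitconcat`, which places its arguments side by side from most to least significant. For the coefficient writes, a 14-bit field, a 10-bit class index, and an 8-bit band index together form a 32-bit word. The sketch below shows the equivalent packing and unpacking with integer shifts. The field names (`region`, `class_idx`, `band_idx`) are assumed labels for illustration; only the field widths are taken from the testbench.

```matlab
% Pack three fields of 14, 10 and 8 bits into one 32-bit address,
% mirroring bitconcat(fi(region,0,14,0), fi(k,0,10,0), fi(j,0,8,0)).
pack_addr = @(region, class_idx, band_idx) ...
    bitor(bitshift(uint32(region), 18), ...
          bitor(bitshift(uint32(class_idx), 8), uint32(band_idx)));

addr = pack_addr(1, 3, 5);

% Recover the individual fields by shifting right and masking.
region_out = bitshift(addr, -18);
class_out  = bitand(bitshift(addr, -8), uint32(1023));   % 10-bit mask
band_out   = bitand(addr, uint32(255));                  % 8-bit mask

fprintf('address = 0x%s (region %d, class %d, band %d)\n', ...
    dec2hex(addr, 8), region_out, class_out, band_out);
```

Viewing the address this way makes the capacity of each field explicit: the 8-bit band field, for example, can only distinguish indices 0 through 255.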
% ------------------------------------------------------------
% normalize.m
% Compute the normal value as done in logistic regression calculation
% ------------------------------------------------------------
function normalized = normalize(data, dark, lightI, mean, stddevI)
    diff = max(single(data - dark), single(0));
    corrected = min(single(diff .* lightI), single(1));
    normalized = single((corrected - mean) .* stddevI);
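The normalization above subtracts the dark reference, clips at zero, scales by `lightI`, clips at one, and then standardizes with the mean and the inverse standard deviation. A short worked example of the three cases (in range, below the dark level, saturated) is sketched below. The function is restated as an anonymous function so that the snippet runs on its own, and the numeric values are chosen only so that the results are exact in single precision; they are not taken from the thesis data.

```matlab
% Same arithmetic as normalize.m, written inline for a standalone example.
normalize_px = @(data, dark, lightI, mu, stddevI) ...
    single((min(single(max(single(data - dark), single(0)) .* lightI), ...
                single(1)) - mu) .* stddevI);

dark    = single(16);       % illustrative dark reference
lightI  = single(1/256);    % illustrative inverse white-reference span
mu      = single(0.25);     % illustrative band mean
stddevI = single(4);        % illustrative inverse standard deviation

in_range  = normalize_px(single(144),  dark, lightI, mu, stddevI); % (0.5 - 0.25) * 4
below     = normalize_px(single(8),    dark, lightI, mu, stddevI); % clipped to 0 first
saturated = normalize_px(single(1040), dark, lightI, mu, stddevI); % clipped to 1 first

fprintf('in range: %g, below dark: %g, saturated: %g\n', ...
    in_range, below, saturated);
```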
% ------------------------------------------------------------
% inner_product.m
% Compute the inner product as done in logistic regression calculation
% ------------------------------------------------------------
function partial_sum = inner_product(normalized, class, previous)
    product = single(normalized * class);
    partial_sum = single(product + previous);
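Each call to `inner_product` performs one multiply-accumulate step, so a complete inner product is obtained by feeding the previous partial sum back in for every band, starting from the intercept. The sketch below illustrates that chaining with a few hand-picked values. The step is restated as an anonymous function so the snippet is self-contained, and `coeffs`, `intercept`, and `acc` are illustrative names.

```matlab
% One multiply-accumulate step, equivalent to inner_product.m.
inner_product_step = @(normalized, coeff, previous) ...
    single(single(normalized * coeff) + previous);

normalized = single([1 2 3]);        % illustrative normalized band values
coeffs     = single([0.5 0.25 2]);   % illustrative class coefficients
intercept  = single(1);

% Start from the intercept and fold in one band per step.
acc = intercept;
for band = 1:numel(normalized)
    acc = inner_product_step(normalized(band), coeffs(band), acc);
end

% Vectorized reference for comparison.
reference = intercept + sum(normalized .* coeffs);
fprintf('accumulated: %g, reference: %g\n', acc, reference);
```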
% ------------------------------------------------------------
% calculation_test.m
% Compute the probability using logistic regression and write to
% spreadsheet
% ------------------------------------------------------------
function [model, exact] = calculation_test(datafi, dark, lightI, ...
        mean_in, stddevI_in, class_in, sheet)
    [~, classes] = size(class_in);
    [~, rows, columns] = size(datafi);
    partial_sum_model = single(zeros(columns, classes));
    partial_sum_exact = zeros(columns, classes);
    for M = 1:classes
        for J = 1:columns
            previous_model = class_in(1,M); %intercept
            previous_actual = double(class_in(1,M));
            for I = 1:rows
                norm = normalize(single(datafi(1,I,J)), dark(I,J), ...
                    lightI(I,J), mean_in(I), stddevI_in(I));
                partial_sum_model(J,M) = inner_product(norm, ...
                    class_in(I+1,M), previous_model);
                previous_model = partial_sum_model(J,M);

                partial_sum_exact(J,M) = (min(max(double(datafi(1,I,J)) ...
                    - double(dark(I,J)), 0) .* double(lightI(I,J)), 1) ...
                    - double(mean_in(I))) .* double(stddevI_in(I)) ...
                    .* double(class_in(I+1,M)) + previous_actual;
                previous_actual = partial_sum_exact(J,M);
            end;
        end;
    end;
    model = partial_sum_model;
    exact = partial_sum_exact;
    T = table(model, exact);
    writetable(T, 'results_lc.xlsx', 'Sheet', sheet, 'Range', 'P1');
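calculation_test.m evaluates the same sum twice, once in single precision (`model`, matching the hardware data path) and once in double precision (`exact`), so that rounding error can be separated from logic errors. The sketch below illustrates how the two diverge for a long accumulation. The data are synthetic and chosen only to make the effect visible; they are not the thesis measurements.

```matlab
terms = repmat(0.1, 1, 1000);    % synthetic terms, 0.1 is not exact in binary

model = single(0);               % single-precision running sum
exact = 0;                       % double-precision running sum
for i = 1:numel(terms)
    model = single(single(terms(i)) + model);
    exact = terms(i) + exact;
end

abs_err = abs(double(model) - exact);
fprintf('single: %.6f  double: %.6f  difference: %.2e\n', ...
    model, exact, abs_err);
```

A difference of this kind between the simulated hardware result and a double-precision reference is expected and does not by itself indicate a fault in the design.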