

### **Departamento de Ingeniería Electrónica**

Doctoral Thesis

# **Development of a data acquisition system using silicon detectors for PET applications**

*Author:*

Vera Koleva Stankova

*Supervisors:*

Dr. Vicente González Millán, Dr. Carlos Lacasta Llácer, Dr. Gabriela Llosá Llácer

*November 2015*


# <span id="page-8-0"></span>Summary

This work describes the development of part of the electronics for a Positron Emission Tomography (PET) scanner called Petete. The scanner must identify coincidence events offline and use the Time-of-Flight (ToF) technique to reject background noise, which helps improve the signal-to-noise ratio (SNR) and therefore the quality of the medical images. The PET scanner will be used mainly in research, to study and test different detectors with the aim of improving scanner performance in terms of spatial resolution, acquisition time (which implies a shorter exposure of the patient to radiation), sensitivity and image quality.

The scanner consists of 16 detector modules based on silicon photomultipliers, with a total of 1024 channels. To collect the ToF information, the front-end electronics must record the arrival time of valid detected events with a precision of the order of hundreds of picoseconds. Given the considerable number of channels and the limited space available, the front-end electronics must be based on an application-specific integrated circuit (ASIC). Each detector module is mounted on a board called the hybrid board, which contains at least one ASIC for recording the arrival time. For this work, two ASICs that suit the needs of the scanner were identified and used: the Vata64hdr16 and the STiC.

The electronics developed consist of two parts. On the one hand, the data acquisition system that reads out the silicon detectors has been developed in full, including both the hardware and the required firmware. This acquisition board is responsible for controlling the ASICs, carrying out the data acquisition process, managing the communication with the computer and transferring the data. To cover the complete scanner, four data acquisition boards working in parallel are needed, each covering 256 channels. The system is controlled by a software program designed for this application and installed on a computer. The data acquisition system is designed to be compact, flexible, fast and adaptable to the ASICs mentioned above. On the other hand, part of this work was devoted to developing part of the digital electronics of the STiC. This work was carried out at the University of Heidelberg (Germany) and gave deeper insight into the development of a data acquisition system, in this case from the point of view of ASIC synthesis.

The electronics and software implemented in the system fully satisfy the needs of the Petete scanner, resulting in a multi-configurable system with fast data transmission over Gigabit Ethernet. The design allows different configurations to be selected, such as different readout modes, different test options and a separate configuration for each hybrid board. The experimental tests carried out verify the correct functional behaviour of all subsystems, such as the ADC, DAC, TDC, triggers, control signals, communication and others, as explained in this thesis.

The system is intended to be used for laboratory research on different silicon sensors and scintillators, since it has been designed to be reconfigurable and easy to adapt to new detectors. So far the HDRDAQ board has been tested with two hybrid boards of 64 channels each. Tests of the complete system with four hybrid boards and four detector modules are planned for the near future. Other planned tests include operating several HDRDAQ boards in parallel in synchronized mode to cover the number of detector modules of the complete scanner.

This work is structured as follows. The first chapter studies the characteristics of the detectors, describes the Petete scanner and defines the requirements of the data acquisition system. Chapter 2 introduces silicon photomultipliers and the characteristics of the ASICs used: the Vata64hdr16 and the STiC. It also covers the development of the hybrid boards that form the modules of the PET scanner. Chapter 3 focuses on the STiC chip and on the development of the digital electronics of the ASIC design. Chapter 4 describes in detail the acquisition electronics that control the chips and communicate with the computer. The design of the acquisition board takes into account the geometry of the scanner, the number of hybrid boards to be controlled and the specific requirements of the ASICs. A dedicated program has been developed to control the scanner and the electronics from the computer. Chapter 5 is devoted to the firmware, and Chapter 6 briefly describes the software. The last chapter covers the laboratory tests carried out to verify the functionality of the system and its different parts, such as the software, electronics and detectors. Finally, the conclusions of the whole work are presented.

# <span id="page-10-0"></span>Preface

In the last decade a great effort has been devoted to the development of Positron Emission Tomography (PET) devices. In the conventional approach, the 511 keV annihilation photons emitted from a patient or small animal are detected by a ring of scintillators such as LYSO coupled to arrays of photon detectors. Although this approach has achieved a spatial resolution of 5 mm FWHM in human studies and 1 mm in dedicated small animal instruments, there is interest in significantly improving these figures. A small animal PET scanner requires better spatial resolution than a PET scanner dedicated to human diagnostics, because the organs under study are significantly smaller. At the same time, energy resolution must be preserved in order to maintain an acceptable image quality.

Silicon Photomultipliers (SiPMs) are a novel kind of solid-state photon detector. They are composed of thousands of avalanche photodiode pixels connected in parallel. SiPMs have a multiplication gain comparable to that of the conventional photomultiplier tubes (PMTs) commonly used in commercial PET scanners. They have an extremely high photon detection efficiency and offer advantages such as compactness, relatively low bias voltage and immunity to magnetic fields. Because of this, they have become excellent candidates for building PET devices. Special readout electronics are required to preserve the high performance of the detector.

This work is divided into two parts. One is dedicated to the digital design of the STiC ASIC and the other to the development of a data acquisition system for the readout of the SiPM detectors used in the construction of a full-ring small animal PET scanner. The scanner consists of 16 detector heads, each placed on a custom-made hybrid board together with the front-end electronics, which is an application-specific integrated circuit (ASIC). The main goal of this work is the design of a data acquisition board that reads out the detector heads. The board is designed to be compatible with two ASICs: Vata64hdr16 and STiC. The STiC chip is included to leave open the possibility of evaluating and testing this newly developed chip. It matches the needs of the scanner very well and its 64-channel version has recently become available. This is the reason why both ASICs are considered in the design of the main board. The acquisition board must also control the ASICs and the acquisition process and manage the communication with the PC and the data transfer. The board has to register the time of arrival of the valid detected events. This adds precision to the data analysis, making it possible to identify coincidence events and discard background noise. This in turn contributes to improving the signal-to-noise ratio (SNR) and thus the medical image quality. Four data acquisition boards have to work in parallel to cover all the detector heads, that is, the 1024 channels. The system is controlled by a custom-designed software program installed on a PC. The system also has to be compact, flexible, fast, adaptable to different detector heads and easy to use. The main use of the PET scanner will be research and laboratory measurements to study and test different detectors, with the aim of improving the performance of the PET scanner in terms of spatial resolution, acquisition time, sensitivity and image quality.

A brief overview of the layout of this work:

Chapter 1 is divided into two main parts. The first one introduces Positron Emission Tomography (PET) as a nuclear medical imaging technique, since it is the main subject of interest in the present work. The second part describes the proposed full-ring small animal PET scanner for which the electronics were developed and specifies the requirements for the system.

Chapter 2 explains the working principle of the Silicon Photomultiplier (SiPM) detectors and their characteristics. It also describes the front-end electronics based on the Vata64hdr16 ASIC and the front-end board.

Chapter 3 is dedicated to the STiC ASIC. It explains its working principles and design.

Chapter 4 presents the design of the main acquisition board and the function of its internal blocks.

Chapter 5 describes the firmware of the acquisition board and its functional blocks.

Chapter 6 introduces the software program developed for the application.

Chapter 7 presents the results of the laboratory tests and measurements carried out with the designed electronics.

Finally, the conclusions of this work and possible future research lines are included.

# <span id="page-12-0"></span>Chapter 1

## Introduction

Medical Imaging aims to visualize biological structures and/or functions within the body and has been used as an essential element of medical care since 1895 when the first imaging technique emerged thanks to the discovery of X-rays.

Today medical imaging diagnostics covers an important area of science and technology. The field can be divided into two parts depending on the origin of the applied radiation. If the radiation is applied from outside the patient, techniques such as tomography, X-rays or computed tomography (CT) are considered. On the other hand, imaging technologies like electrocardiographic imaging (ECGI), positron emission tomography (PET), single-photon emission computed tomography (SPECT) or magnetic resonance imaging (MRI) belong to the group in which the radiation originates inside the body.

Functional imaging refers to methods that are capable of obtaining information about metabolic processes in regions of the body, while structural imaging makes the structure of the region of interest observable [\[Dec\]](#page-117-0). There are also methods that combine both techniques and can therefore be functional as well as structural, for example PET combined with CT (PET-CT) or with magnetic resonance imaging (PET-MRI).

Medical imaging opens new ways for the detection and diagnosis of diseases in humans because it allows examining the internal anatomy of the patient without applying surgery or another invasive technique [\[Hen,](#page-118-0) [End\]](#page-118-1).

### <span id="page-12-1"></span>1.1 Nuclear medical imaging

The physical process common to all nuclear medical imaging techniques is the detection and imaging of ionizing radiation originating within the body. To this end, a radioisotope (or radiotracer) that decays emitting ionizing radiation is incorporated into a biologically active compound (radiopharmaceutical), which is introduced into the body either by injection or inhalation. Inside the body the pharmaceutical compound carries the radioactive isotope to different organs or tissues according to its bio-kinetic properties, and in this way it accumulates in the patient following a characteristic distribution. This distribution can be tracked and imaged thanks to the radiation emitted by the radiotracer and to the detectors placed near the patient, which record the distribution of the emitted radiation. From the imaged distribution it is possible to detect where certain metabolic processes occur and to measure concentration changes over time [\[Hen,](#page-118-0) [Mos\]](#page-115-0). In conventional nuclear imaging, the signals from the detectors are processed to obtain two-dimensional planar images of the three-dimensional distribution of radioactivity inside the patient. In emission tomography, two-dimensional cross-sectional images or even a whole three-dimensional volume are reconstructed from multiple planar images obtained at different angles around the patient [\[Hen\]](#page-118-0).

The two main tomography techniques used in the field of nuclear medical imaging are Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) [\[End\]](#page-118-1). The difference between them lies in the physics of the administered radiotracer, which affects the detector design. In the first case, the $\gamma$-emitting radioisotope emits $\gamma$-rays as it decays, while in the second case the positron-emitting radioisotope releases positrons. These positrons annihilate with electrons of the surrounding medium, producing two $\gamma$-rays that travel in opposite directions. Nevertheless, the detection systems used in nuclear medical imaging have common characteristics: all of them detect $\gamma$-rays and should be very efficient, since the total radiation dose that can be administered to the body is limited and the amount of tracer substance administered must be small enough not to perturb the system. Because of this safety concern, statistical noise cannot be reduced by increasing the activity. It must instead be reduced by maximizing the detection efficiency, which is one of the main challenges in the development of new detector systems.

### <span id="page-13-0"></span>1.2 Positron emission tomography

One very important method for tomographic imaging in medicine is positron emission tomography. Positron emission detection systems were developed in the 1950s, but it is in the last 25 years that their performance has improved considerably thanks to the development of new materials, new data acquisition and processing systems and the introduction of new positron-emitting radiopharmaceuticals. For example, the introduction of $^{18}$F-fluorodeoxyglucose (FDG) as an oncology radiotracer was the turning point in PET development. Because of its long half-life, even centers without a cyclotron could apply this imaging technique, which increased the use of PET for clinical diagnostics [\[Wer\]](#page-118-2).

PET imaging relies on radioisotopes that are positron emitters. The emitted positron interacts with the surrounding tissue and annihilates with an electron. Their combined mass is then converted into the energy of two photons of 511 keV each, which are emitted simultaneously in opposite directions, as sketched in Figure [1.1.](#page-14-0)



<span id="page-14-0"></span>Figure 1.1: Positron Emission Tomography working principle. The radionuclide emits positrons, which annihilate with electrons of the medium, generating two photons that travel in opposite directions. These photons must be detected in coincidence in order to be acquired.

The task of the PET scanner is to detect each pair of photons (or as many photon pairs as possible) with enough information to determine accurately, by means of data reconstruction, the region where the radiotracer emitted the positron that started the annihilation event. In order to detect these annihilation gamma-rays, each detector module of the PET scanner is operated in time coincidence with those on the opposite side. Because of the positron annihilation, two photons are expected to be observed at almost the same time (in coincidence) in the detector ring. The annihilation event is then located somewhere on the line connecting the two photon-detection points. This knowledge of the photon direction is a big advantage over single-photon emission computed tomography (SPECT), where collimators have to be used to restrict the possible photon directions at the detectors, at the cost of a large reduction in sensitivity.

Several factors prevent the two photons from being detected at exactly the same time. The annihilation may occur closer to one detector surface than to the other, which results in a slight but measurable delay of one photon, since both photons travel at the speed of light. The most important source of temporal mismatch is the finite timing resolution of the detector, that is, its timing uncertainty, which arises from the decay time of the scintillation in the crystal and the processing time of the detector signals. These effects lead to the use of a coincidence time window of the order of 6 to 10 ns [\[Lin\]](#page-118-3).

With TOF-PET imaging, the relative time difference ($\Delta t_1$ and $\Delta t_2$) between the detection of the two annihilation photons is used to determine the most likely location of the annihilation event along the Line-of-Response (LOR). The recent development of inorganic scintillators suitable for TOF-PET, such as Lutetium Yttrium Ortho-Silicate (LYSO) and Lanthanum Bromide (LaBr$_3$), combined with advances in the timing resolution and timing stability of detector electronics, has led to a resurgence of interest in TOF-PET scanners. Many state-of-the-art detector systems use the event timing of detected particles in order to push their sensitivity to the limit. By including the time information for coincidence or Time-of-Flight measurements in the analysis, a significant portion of background processes can be distinguished from the desired signal events. This provides a higher signal-to-noise ratio (SNR) and thus increases the sensitivity of the detector system. For nuclear medicine applications, and especially positron emission tomography, the goal is to achieve higher detector sensitivities, faster image reconstruction times and a smaller amount of the required radioactive tracer molecules, without sacrificing the image quality and resolution [\[Har\]](#page-115-1).

### <span id="page-15-0"></span>1.2.1 Reconstruction of the positron origin

Two photons are considered to originate from the same annihilation process if they arrive within a coincidence window of a few nanoseconds. For each coincidence event the corresponding Line-of-Response (LOR) is recorded. Because of the collinearity of the two photons, the decay must have occurred at a point along this line. The precision of this information depends on the size of the detector modules. A higher detector granularity with smaller scintillator modules allows a more precise LOR to be reconstructed, thereby improving the spatial resolution of the PET system [\[Har\]](#page-115-1).
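The coincidence criterion described above can be illustrated with a minimal offline sorting sketch. The event format (arrival time in nanoseconds plus a module index), the function name `find_coincidences` and the 6 ns default window are assumptions made for this illustration and do not reproduce the analysis software of this work. Events with more than two photons inside the window are not treated specially.

```python
from typing import List, Tuple

# Hypothetical single-event record: (arrival time in ns, detector module index)
Single = Tuple[float, int]


def find_coincidences(singles: List[Single],
                      window_ns: float = 6.0) -> List[Tuple[Single, Single]]:
    """Pair time-adjacent singles from different modules within the window."""
    events = sorted(singles)  # order by arrival time
    pairs = []
    i = 0
    while i < len(events) - 1:
        first, second = events[i], events[i + 1]
        same_module = first[1] == second[1]
        if second[0] - first[0] <= window_ns and not same_module:
            pairs.append((first, second))
            i += 2  # both singles are consumed by this coincidence
        else:
            i += 1  # the earlier single has no partner, move on
    return pairs


singles = [(0.0, 0), (1.5, 8), (100.0, 3), (250.0, 5),
           (252.0, 13), (400.0, 7), (401.0, 7)]
pairs = find_coincidences(singles)
for a, b in pairs:
    print(f"LOR between modules {a[1]} and {b[1]}, dt = {b[0] - a[0]:.1f} ns")
```

Each returned pair defines one LOR between the two modules involved. The last two singles in the example are rejected because they come from the same module.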

The remaining momentum of the positron when annihilating disturbs the collinearity, so the generated photons are not emitted at precisely $180^{\circ}$. Although the deviation from a strict back-to-back emission is only $0.25^{\circ}$, the error on the position reconstruction can be significant, depending on the distance between the coincidence detector modules.
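To give a feeling for the size of this effect, the sketch below uses a commonly quoted small-angle estimate in which the blur is half the detector separation times the tangent of the angular deviation. The estimate, the function name and the example separations are assumptions for illustration and are not taken from the measurements in this work.

```python
import math


def noncollinearity_blur_mm(detector_separation_mm: float,
                            deviation_deg: float = 0.25) -> float:
    """Approximate position error caused by photon non-collinearity.

    Assumes an annihilation midway between two detectors separated by
    detector_separation_mm and an angular deviation of deviation_deg.
    """
    return 0.5 * detector_separation_mm * math.tan(math.radians(deviation_deg))


for separation in (100.0, 1000.0):  # illustrative separations in mm
    blur = noncollinearity_blur_mm(separation)
    print(f"separation {separation:6.0f} mm -> blur {blur:.2f} mm")
```

Under these assumptions the error grows linearly with the detector separation, which is why the effect matters more for large rings than for small animal scanners.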

Before being detected in the detector modules, the photons have to pass through the surrounding tissue of the patient. Since the energy of the 511 keV photons is too small for pair production, there are only two processes by which they interact with matter: they can be absorbed through the photoelectric effect or deflected by Compton scattering. Both cases can lead to misidentified coincidences that decrease the image contrast.

Compton scattering with electrons of the surrounding matter can divert one or both annihilation photons from a direct path to the detector modules, as shown in Figure [1.2a](#page-16-0). The resulting LOR of such events is displaced and contributes to the signal background. Since the mean free path of a 511 keV photon in body tissue is about 7 cm, the contribution of scattered coincidences to the image reconstruction is significant. To improve the image contrast, these events have to be identified and, if possible, excluded from the reconstruction data.



<span id="page-16-0"></span>Figure 1.2: Erroneous coincidence signals: (a) scattered coincidence, (b) random coincidence.

During the scattering process, the photon transfers energy to the electron, and the remaining energy of the outgoing photon can be calculated as [\[Bog\]](#page-115-2):

$$
E_{\gamma'} = \frac{E_{\gamma}}{1 + \frac{E_{\gamma}}{m_e c^2} (1 - \cos(\phi))} \tag{1.1}
$$

The equation shows that the energy depends on the scattering angle $\phi$. The maximum energy is $E_{\gamma}$ for an angle of $0^{\circ}$, which corresponds to no scattering, whereas the minimum energy is reached at an angle of $180^{\circ}$. An ideal detector with perfect energy resolution would make it possible to discard all scattered coincidence events by selecting only coincidence events in which both detected photons have an energy of 511 keV. Since this ideal case can never be achieved in a real detector system, there is a minimum scattering angle that the system can resolve from unscattered photons. A better energy resolution allows a more precise cut on the photon energy and thus the rejection of smaller scattering angles. Assuming an energy resolution of 10%, the smallest angles the detector can distinguish are approximately $30^{\circ}$. A detector with a resolution of 20% can only reject angles larger than about $40^{\circ}$. This calculation shows that, even with a good resolution, many scattered events are still difficult to distinguish from true coincidence events, and additional techniques have to be used to reduce their contribution to the image background [\[Con\]](#page-115-3).
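The relation between energy resolution and the smallest rejectable scattering angle can be checked numerically. The sketch below evaluates the scattered photon energy as a function of angle and inverts the relation for a given fractional energy cut. Treating the energy resolution as a hard cut at $(1 - f)\,E_{\gamma}$ is a simplifying assumption, and the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0  # m_e c^2


def scattered_energy_kev(e_kev: float, angle_deg: float) -> float:
    """Energy of the outgoing photon after Compton scattering by angle_deg."""
    ratio = e_kev / ELECTRON_REST_ENERGY_KEV
    return e_kev / (1.0 + ratio * (1.0 - math.cos(math.radians(angle_deg))))


def min_rejectable_angle_deg(e_kev: float, fractional_cut: float) -> float:
    """Smallest angle whose scattered energy falls below (1 - cut) * e_kev.

    Inverting the scattering formula for E'/E = 1 - f gives
    1 - cos(phi) = (m_e c^2 / E) * f / (1 - f).
    """
    one_minus_cos = (ELECTRON_REST_ENERGY_KEV / e_kev) * fractional_cut / (1.0 - fractional_cut)
    cos_phi = 1.0 - one_minus_cos
    if cos_phi < -1.0:
        raise ValueError("energy cut is below the backscatter energy")
    return math.degrees(math.acos(cos_phi))


for cut in (0.10, 0.20):
    angle = min_rejectable_angle_deg(511.0, cut)
    print(f"cut {cut:.0%}: angles above {angle:.1f} deg can be rejected")
```

With this simplified cut the angles come out at roughly 27 and 41 degrees, in line with the approximate values quoted above.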

Random or accidental coincidences can occur if more than one positron annihilates within the coincidence time window. If one or two photons from different annihilations are absorbed in the body tissue or are not detected in the PET ring, as shown in Figure [1.2b](#page-16-0), the resulting LOR is not related to either annihilation point and contributes to the signal noise. The rate of random coincidences is correlated with the length of the coincidence time window: a longer window will build more coincidences from separate annihilation events than a short one. A high time resolution of the detector allows this coincidence window to be constrained to shorter time intervals. However, the window has to be large enough to cover the full detector Field-Of-View (FOV), which sets a lower limit on its length. For a full-body PET system with a diameter of 1 m, the minimum window that does not constrain the FOV is $1\,\mathrm{m}/c \approx 3.3$ ns, where $c$ is the speed of light. Because of this minimum coincidence time window, improving the time resolution of the detector beyond it does not yield a better rejection of random events. Still, a sufficiently high resolution can be used for Time-of-Flight measurements of the photons to further improve the sensitivity of the detector.

### <span id="page-17-0"></span>1.2.2 Time-of-Flight PET

The image reconstruction of a conventional PET system suffers from a large amount of statistical noise since the annihilation position of the positron along the reconstructed Line-of-Response is unknown. Because of this uncertainty, every point along the line has to be considered as a potential origin of the annihilation and included with the same weight in the image reconstruction. By measuring the arrival time of the 511 keV photons with high precision, the decay point along the LOR can be estimated. A direct position reconstruction with a precision of 3 mm would require a timing resolution of the order of 10 ps, which is impossible to achieve with current technology in a large-scale system. Although the additional Time-of-Flight information does not improve the spatial resolution, it allows the system to exclude most of the possible annihilation points along a Line-of-Response, improving the contrast and reducing the acquisition time needed to form an image.
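The link between timing and position along the LOR can be sketched with the usual relation $\Delta x = c\,\Delta t / 2$, where the factor of two appears because a shift of the annihilation point shortens one photon path and lengthens the other. The function names and the 500 ps example are assumptions for illustration and are not results of this work.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond


def tof_offset_mm(t1_ps: float, t2_ps: float) -> float:
    """Offset of the annihilation point from the LOR midpoint.

    A positive value means the point lies closer to detector 1,
    which is the detector that recorded its photon first.
    """
    return C_MM_PER_PS * (t2_ps - t1_ps) / 2.0


def tof_position_uncertainty_mm(timing_resolution_ps: float) -> float:
    """Localization uncertainty along the LOR for a given timing resolution."""
    return C_MM_PER_PS * timing_resolution_ps / 2.0


print(f"offset for 200 ps difference: {tof_offset_mm(100.0, 300.0):.1f} mm")
print(f"uncertainty at 500 ps: {tof_position_uncertainty_mm(500.0):.1f} mm")
```

With a timing resolution of a few hundred picoseconds the localization uncertainty is several centimetres, which is why the Time-of-Flight information restricts the segment of the LOR that enters the reconstruction rather than replacing the reconstruction itself.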

The additional position information makes it possible to recognize artificial intersection points of several LORs and to distinguish them from true tracer concentrations. The impact of the timing resolution on the signal-to-noise ratio has been the focus of many research publications [\[Mos,](#page-115-0) [Tom,](#page-115-4) [Cont\]](#page-115-5). They show that the signal-to-noise ratio of the image increases with better timing resolution and thereby better spatial localization of the positron decay.

In addition to the high timing resolution, the features of SiPMs, in particular their small form factor and insensitivity to magnetic fields, have paved the way for the development of new detector concepts. These systems require a combined development effort for the whole detector chain, consisting of scintillators, light sensors and readout electronics, to fully exploit the possible advantages of the new technologies. The SiPMs are explained in detail in Chapter 2.

### <span id="page-17-1"></span>1.2.3 Performance

The main challenge in nuclear medical imaging is to detect as many counts as possible and to localize these counts as accurately as possible. Accurate localization of the γ-ray interaction inside the detector is related to the capability to remove or correct the background events. In order to improve the sensitivity of the scanner, it is necessary to maximize the number of detected counts and to use an imaging system with high-efficiency detectors and large solid-angle coverage. The system must also be able to run at high counting rates, so that no counts are lost due to dead time, and with narrow time windows, which minimize accidental coincidences. It is therefore worth considering which features are of interest in the design of a PET system, since they influence the system performance. This section explains the influence of the following features: spatial and energy resolution, scatter fraction, count rate performance and sensitivity.

### <span id="page-18-0"></span>1.2.4 Spatial resolution

The ability of the scanner to reproduce the details of a radionuclide distribution is known as the spatial resolution. Thus, the finer the details an imaging system reproduces, the better its spatial resolution. It is necessary to differentiate between intrinsic spatial resolution, spatial resolution and reconstructed spatial resolution. The intrinsic spatial resolution is a measure of the uncertainty in the localization of where the annihilation photon was detected in the crystal. It is a characteristic parameter of the scanner design (distance between the scintillator detectors, their dimensions, their stopping power, the incident angle of the photon on the detector and its depth of interaction, among other parameters) and sets a lower bound on the achievable spatial resolution. The spatial resolution is determined by the intrinsic spatial resolution plus the effects of the non-zero positron range after radioisotope decay, the non-collinearity of the annihilation photons due to the residual momentum of the positron, and the parallax error. Finally, the reconstructed spatial resolution takes into account the reconstruction method employed. A given system can provide different spatial resolutions depending on the reconstruction method. As a consequence, the value of the reconstructed spatial resolution of the scanner should be given together with the reconstruction method employed.

### <span id="page-18-1"></span>1.2.5 Energy resolution

The ability of a detector to distinguish between two radiations of different energies is known as energy discrimination capability or energy resolution.

Because the amount of light produced in the crystal increases with the gamma energy, and more light has less statistical variation, the energy resolution depends on the $\gamma$-ray energy. A gamma-ray with higher energy has better energy resolution, although the two are not linearly related. In an ideal PET scanner, where only true coincidences are acquired, a well-defined photopeak is obtained at 511 keV, broadened only by the energy resolution of the detectors. However, this photopeak is further broadened because the PET system also acquires scattered events. Indeed, the contribution of $\gamma$-rays with energies below 511 keV is mainly due to Compton-scattered events. Therefore, in order to remove these scattered events it is necessary to set the proper energy thresholds.

### <span id="page-19-0"></span>1.2.6 Scatter fraction

The sensitivity of the scanner to scattered events depends on the scanner design. This feature is quantified by the scatter fraction of the system, which is given by the ratio of scattered events to total events (including scatter). It is worth mentioning that this parameter depends on the material and size of the scattering medium, the acceptance angle, the setting of the energy thresholds, the radiopharmaceutical distribution and the method used to estimate the scattered events [\[Bai\]](#page-117-1). It is a relevant factor in PET, since detected scattered events constitute between 20% and 50% of the measured signal.

### <span id="page-19-1"></span>1.2.7 Count rate performance

This parameter refers to the capability of the system to acquire events at the source decay rate, including both dead time effects and random coincidence rates, and it assesses the impact of increasing count rates on image quality. It depends on the scanner design, on the experimental conditions and on the methods for dealing with the measured data. Therefore, another parameter called the noise equivalent count rate (NEC) is employed in order to compare the count rate performance of different scanners, or of the same scanner used for different experiments. The NEC is the count rate that would have resulted in the same signal-to-noise ratio in the data in the absence of scatter and random events. It is computed according to the equation:

$$
NEC = \frac{T^2}{T + S + 2kR} \tag{1.2}
$$

where $T$, $S$ and $R$ are the true, scatter and random counting rates, respectively, and $k$ is the fraction of the transverse FOV occupied by the object. Furthermore, in order to assess the NEC, the phantom size, the setting of the energy thresholds and the method of scatter fraction estimation have to be considered.
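The two figures of merit defined in this section and the previous one can be evaluated directly from measured counting rates. The sketch below is a minimal illustration in which the function names and the example rates are hypothetical, and the total events in the scatter fraction are assumed to be trues plus scatter.

```python
def scatter_fraction(trues: float, scatter: float) -> float:
    """Ratio of scattered events to total events (trues plus scatter)."""
    return scatter / (trues + scatter)


def noise_equivalent_count_rate(trues: float, scatter: float,
                                randoms: float, k: float) -> float:
    """NEC = T^2 / (T + S + 2 k R), following equation (1.2)."""
    return trues ** 2 / (trues + scatter + 2.0 * k * randoms)


# Illustrative rates in counts per second, not measured values
T, S, R = 100.0, 50.0, 25.0
print(f"scatter fraction: {scatter_fraction(T, S):.2f}")
print(f"NEC: {noise_equivalent_count_rate(T, S, R, k=0.5):.1f} cps")
```

In the absence of scatter and random events the NEC reduces to the true counting rate, as expected from its definition.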

### <span id="page-19-2"></span>1.2.8 Sensitivity

The sensitivity of a scanner describes its ability to detect the annihilation radiation. It is calculated as the count rate divided by the source activity. The sensitivity of a tomograph is determined by a combination of the radius of the detector ring, the axial length of the active volume for acquisition, the total axial length of the tomograph, the stopping power of the scintillation detector elements, the packing fraction of the detectors and other operator-dependent settings (e.g., the energy threshold and the method of measurement).

A scanner with low sensitivity will detect a lower number of events and this directly translates to a noisier reconstructed image. For this noisier image to be presented in a diagnostically acceptable form, it needs first to undergo a smoothing process, which results in a loss of spatial resolution.

### <span id="page-20-0"></span>1.3 The small animal PET scanner - Petete

Before building a human PET scanner with new detectors and electronics, the proof of concept can be tested by first building a small animal PET scanner prototype. In this way the whole process, starting from the bio-markers, passing through mechanics, electronics and software, and ending with the obtained medical image, can be tested on a small scale with reduced cost and development time. However, small animal PET scanners have their own challenges because everything is much smaller. The organs are smaller and the distances are shorter, which can be advantageous for sensitivity but requires better timing and spatial resolution from the scanner.



<span id="page-20-1"></span>Figure 1.3: Sketch of the full-ring PET prototype. 16 detector modules are read out with 4 DAQ boards.

This work aims to build a data acquisition system for a full-ring small animal PET scanner, called Petete, for research purposes. The scanner employs monolithic SiPM matrices and continuous LYSO crystals arranged in 16 detector heads. Each head has a SiPM matrix, the scintillator and the front-end electronics (the ASIC). The SiPM matrix has 64 pixels with a size of 1.4 mm x 1.5 mm at a 1.5 mm pitch, with readout pads on two opposite sides. The scintillator crystal is of type LYSO with a size of 12 mm x 12 mm x 5 mm. The detector heads are placed on a hybrid board. Four hybrid boards are connected to a main acquisition board. Four acquisition boards will work synchronously to read out the 16 detector heads. Figure [1.3](#page-20-1) shows a sketch of the proposed scanner geometry.
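The channel budget implied by this geometry can be checked with a few lines of arithmetic. The sketch below assumes one detector head per hybrid board, which is what the figures of 16 heads, four hybrid boards per acquisition board and four acquisition boards imply; the constant names are illustrative.

```python
PIXELS_PER_HEAD = 64   # SiPM matrix pixels per detector head
HYBRIDS_PER_DAQ = 4    # hybrid boards connected to one acquisition board
DAQ_BOARDS = 4         # acquisition boards working synchronously
HEADS_PER_HYBRID = 1   # assumption: one detector head per hybrid board

detector_heads = DAQ_BOARDS * HYBRIDS_PER_DAQ * HEADS_PER_HYBRID
channels_per_daq = HYBRIDS_PER_DAQ * HEADS_PER_HYBRID * PIXELS_PER_HEAD
total_channels = detector_heads * PIXELS_PER_HEAD

print(detector_heads, channels_per_daq, total_channels)
```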

### <span id="page-21-0"></span>1.3.1 DAQ specifications

The DAQ board, called HDRDAQ, has to fulfill the requirements of the Petete scanner. This includes handling several hybrid boards, providing the signals needed for ASIC control, pre-processing and packing the received data, and sending them to the PC. The system has to be controlled from a PC by a software application communicating with an FPGA, which interprets and executes specific commands. Another application of the DAQ board is in laboratory setups for the test and characterization of different detector configurations in combination with different scintillation crystals. In this application the system must be easily reconfigurable to work with different hybrid boards. This is one of the reasons for building a modular acquisition system, which gives the flexibility to continue research on improving the detector head.

The requirements for the front-end electronics are:

- $\Diamond$  The front-end electronics has to fit in the small available area, given the high number of channels concentrated within the ring, which makes an ASIC probably the best option.
- $\Diamond$  It has to handle signals from SiPM detectors and accept both positive and negative signals.
- $\Diamond$  It has to provide good energy resolution.
- $\Diamond$  It has to be able to measure the arrival time of the photons.

In particular, the requirements for the DAQ board are the following:

- $\Diamond$  The board must be as independent of laboratory equipment as possible.
- $\Diamond$  The board must generate the specific control signals needed by the ASIC. This includes the generation of certain digital and analog signals.
- $\Diamond$  The acquisition process has to be controlled from a PC with user-friendly software developed for the application.
- $\Diamond$  The acquired data have to be organized in packets and sent to the PC for further offline analysis.
- $\Diamond$  The board must be able to work in coincidence mode with other boards or setups. This requires a common time reference between boards.
- $\Diamond$  The board has to have a *time stamp* mechanism, implemented in the FPGA and used to register and store the time of arrival of validated triggers from each hybrid board with a precision of at least 1 ns.
- $\Diamond$  The board has to be able to generate an analog test pulse with variable amplitude for each hybrid board, to allow internal calibration of the ASICs.
- $\Diamond$  The board has to carry out the readout process of four hybrid boards in parallel.
- $\Diamond$  The board should have a single input voltage (power supply) and generate the rest of the needed voltage levels internally.
- $\Diamond$  The board has to provide fast acquisition processing.
- $\Diamond$  The board has to feature low power consumption and low noise.

Electronics with these characteristics are not commercially available, mainly because specific front-end electronics are needed to fulfill the scanner requirements. Dedicated readout electronics need to be developed to cover the number of readout channels and the custom detector heads built with newly available detector types.

# <span id="page-24-0"></span>Chapter 2

# Photon detectors and front-end electronics

This chapter is divided into two main sections. The first one introduces the basic operating principle of the photon detectors proposed for the Petete scanner and explains their main characteristics. The second part of the chapter describes the front-end electronics dedicated to reading out the signals from the detectors.

### <span id="page-24-1"></span>2.1 Introduction to Silicon Photomultipliers (SiPMs)

Geiger-mode Avalanche PhotoDiodes (GAPDs), commonly known as Silicon Photomultipliers (SiPMs), are a novel kind of silicon photon detector. Other commercial names are also used, for example Multi-Pixel Photon Counter (MPPC). Compared to conventional photomultiplier tubes (PMTs) they have several advantages, such as small size and MR compatibility. Furthermore, their excellent photon resolving capabilities and good timing performance make them a better solid state photon detector than avalanche photodiodes (APDs) and good candidates for Time-of-Flight (ToF) PET applications.

Technological developments in the field of particle detectors have not only led to many important discoveries in particle physics, but are also used in many other research fields such as nuclear medicine. The detection of high-energy photons from positron annihilations is the fundamental principle of positron emission tomography. In combination with a radioactive tracer substance, this technique makes it possible to locate tumor cells in a body, and it has become a state-of-the-art diagnostic tool in cancer treatment. The continuous improvement of detector technologies increases the sensitivity of these systems and thus the resolution of the imaging technique. It also reduces the radioactive dose that has to be injected into the patient. Because of their insensitivity to magnetic fields, SiPMs make it possible to integrate PET imaging with magnetic resonance imaging (MRI), so that an image of the body tissue can be recorded simultaneously and overlaid with the information on metabolic activity. Without such integration, a combined image of these complementary techniques requires the acquisition of two separate images, which have to be manually overlaid and aligned. The high timing resolution possible with SiPMs is furthermore exploited in Time-of-Flight PET systems, where the measured arrival time of the annihilation photons is used to significantly increase the sensitivity of the image reconstruction. The full potential of Silicon Photomultipliers can only be exploited in a detector system through the development of dedicated readout electronics. In particular, the high timing resolution possible with these sensors has not yet been fully exploited in larger detector systems.

A SiPM device is a silicon pixel array composed of hundreds of identical pixels connected in parallel. Each pixel consists of an avalanche photodiode (APD) and a quenching resistor in series, as illustrated in Figure [2.1](#page-25-1). The APDs are biased above the breakdown voltage and thus operated in the so-called Geiger mode. In this mode the avalanche multiplication process does not stop by itself. The quenching resistor provides local negative feedback to the pixel diode: the large avalanche current causes a significant voltage drop across the resistor, which reduces the bias voltage across the diode. Once this voltage falls back to the breakdown voltage, the avalanche is quenched; the pixel then recovers to its initial state and is ready for a new avalanche process. The device usually has a surface area of several mm<sup>2</sup> and the pixel-to-pixel distance (pitch) is normally tens of microns. Figure 2.2 shows a photo of a typical SiPM produced by Hamamatsu, Japan [\[Ham,](#page-114-0) [She\]](#page-114-1).



<span id="page-25-1"></span>Figure 2.1: Sketch of a SiPM pixel array. Each pixel consists of an APD and a quenching resistor.

The most important characteristics of SiPMs (gain, dark count noise, cross-talk, afterpulsing and photon detection efficiency, PDE) are described below.

### <span id="page-25-0"></span>2.1.1 Gain

Due to the nature of the Geiger mode avalanche, each single pixel can be used as a binary photon counter [\[She\]](#page-114-1). The output of a pixel is always the same, independent of the number of absorbed photons. Since the pixels are connected in parallel, the SiPM detector can be used as a photon counting device. Figure [2.3a](#page-26-0) shows an oscilloscope snapshot of the SiPM output waveform. The displayed waveforms correspond to signals of one, two, three and more pixels fired at the same time. If the charge of this output signal is integrated, it yields a charge spectrum like the one shown in Figure [2.3b](#page-26-0), where the x-axis shows the amount of charge or the number of photons detected and the y-axis is the number of recorded events. The high gain of  $\sim 10^6$  is determined by the amplification factor of the individual pixel. The gain increases with larger pixel dimensions due to the larger pixel capacitance. The gain is calculated from the distance between adjacent photoelectron peaks.

Figure 2.2: Photo of Hamamatsu MPPC [\[Ham\]](#page-114-0).



(a) Oscilloscope snapshot of the SiPM output waveforms for different numbers of fired pixels [\[Ham\]](#page-114-0).

(b) Typical SiPM photon spectrum [\[Har\]](#page-115-1).

<span id="page-26-0"></span>Figure 2.3: SiPM response to low-intensity light.
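The gain extraction from the photoelectron spectrum can be sketched as follows: the mean charge distance between adjacent peaks is divided by the elementary charge. The peak positions used here are hypothetical values picked to give a gain near $10^6$; in practice they would come from a fit to the measured charge spectrum, and any amplifier gain in the readout chain would have to be divided out first.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulomb


def gain_from_peaks(peak_charges_c):
    """Estimate the SiPM gain from photoelectron peak positions (in coulomb).

    The gain is the mean distance between adjacent peaks expressed in
    units of the elementary charge.
    """
    if len(peak_charges_c) < 2:
        raise ValueError("at least two peaks are required")
    ordered = sorted(peak_charges_c)
    spacings = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return mean_spacing / ELEMENTARY_CHARGE_C


# Hypothetical peak positions: pedestal, 1 p.e., 2 p.e., 3 p.e.
peaks = [0.0, 0.16e-12, 0.32e-12, 0.48e-12]
print(f"gain = {gain_from_peaks(peaks):.3e}")
```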

### <span id="page-27-0"></span>2.1.2 Dark Count Noise

Apart from photon absorption, Geiger pulses can also be triggered by thermally generated electron-hole pairs. These pulses are indistinguishable from the signals generated by photon absorption and can occur at any time and under any conditions. Such events are called dark noise events, and the dark count rate (DCR) is a key characteristic of Silicon Photomultipliers. The probability of thermal excitation rises exponentially with the temperature of the sensor [\[Vac\]](#page-115-6). Cooling the sensor can therefore significantly decrease the DCR. The thermal excitation of electrons is further increased by impurities in the crystal lattice, which add energy states in the band gap. Current SiPM models from Hamamatsu achieve a DCR of less than  $100 \text{ kHz/mm}^2$  at a temperature of  $25 \degree C$  [\[Dat\]](#page-115-7).

#### <span id="page-27-1"></span>2.1.3 Cross-talk

Cross-talk can occur if photons generated during an avalanche breakdown escape and are detected in a neighbouring pixel, as sketched in Figure [2.4](#page-27-3). The emission and absorption of the photons happen on a short time scale of less than 1 ps, and hence it is impossible to distinguish cross-talk events from the absorption of two separate photons of a light pulse. A higher over-voltage leads to a higher sensor gain and to the generation of more charge carriers contributing to the avalanche. This results in an increased number of photons emitted from the pixel, and therefore in an increased cross-talk probability. Larger pixel sizes increase the chance that an emitted photon is absorbed again within the same pixel, reducing the cross-talk effect. By adding deep trenches between the pixels, the emission of photons to neighbouring pixels can be further inhibited. As with the dynamic range, the additional space required for trenches reduces the fill factor of the sensor, and thereby the photon detection efficiency [\[Har\]](#page-115-1).



<span id="page-27-3"></span>Figure 2.4: Sketch of direct optical cross-talk in SiPM.

### <span id="page-27-2"></span>2.1.4 Afterpulsing

Afterpulsing is another drawback of SiPMs and usually refers to a secondary avalanche process that occurs after the bias voltage of a pixel has sufficiently recovered for another breakdown. Afterpulsing is produced by two mechanisms. The first mechanism takes place when photons emitted during the initial avalanche process are absorbed in the intrinsic material of the Geiger-mode APD and generate an electron-hole pair in this region. If the diode voltage is above the breakdown voltage when the electron arrives in the multiplication region after its drift time, a secondary avalanche process is triggered. The second mechanism is due to the temporary trapping of charge carriers in trapping centers generated by impurities in the silicon substrate. The lifetime of such a trapped state can be longer than the recovery time of the pixel. Once the electron is released again, it can re-trigger the avalanche process, resulting in the observed afterpulse. An example of an afterpulse signal is shown in Figure [2.5](#page-28-1). Although afterpulses are also indistinguishable from a real photon signal, their properties can be investigated by studying the time difference between two successive dark noise pulses.



<span id="page-28-1"></span>Figure 2.5: *Example of a SiPM afterpulse signal [\[Ham\]](#page-114-0)*.

### <span id="page-28-0"></span>2.1.5 Photon Detection Efficiency (PDE)

Photon detection efficiency is the probability that a photon triggers an avalanche process. It determines the fraction of photons that are not seen by the photon detection system. The PDE is the product of three factors:

$$
PDE = \epsilon_{gm} \cdot Q_E \cdot P_{tr} \tag{2.1}
$$

where  $\epsilon_{gm}$  is called the *fill factor* and is determined by the ratio between the area covered by sensitive pixels and the total surface area of the sensor. Quenching resistors and conducting metal also need space on the detector area, so  $\epsilon_{gm}$  is always smaller than one.  $Q_E$  is called the *quantum efficiency* and is defined as the probability of carrier generation for incoming photons.  $P_{tr}$  is the *triggering probability* and refers to the probability that a created carrier will trigger an avalanche process. The fill factor increases with larger pixel size. The detection efficiency differs significantly between SiPM models, and typical values are in the range of 10% to 40% [\[Eck,](#page-114-2) [She\]](#page-114-1).

### <span id="page-29-0"></span>2.1.6 Temperature Dependence

The temperature of a SiPM has a strong influence on many parameters of the pn-junction in the APD cells, among others the width of the depletion zone, the recovery time, the resistivity of the semiconductor material and, first and foremost, the breakdown voltage, which typically changes linearly by roughly 50 mV/K. Therefore, the stable operation of a detector system requires the stabilization of the sensor temperature at the level of 1 K. If such a control is not feasible, the temperature has to be monitored and the bias voltage  $V_{bias}$  adjusted so that the gain stays constant.
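The bias adjustment described above can be sketched as a linear correction. This is a minimal illustration that assumes the gain depends only on the over-voltage and that the breakdown voltage shifts linearly with the 50 mV/K coefficient quoted above; the reference bias of 70 V is a hypothetical value.

```python
def compensated_bias(v_bias_ref, temp_ref_c, temp_c, coeff_v_per_k=0.050):
    """Return the bias voltage that keeps the over-voltage constant.

    v_bias_ref    -- bias voltage (V) used at the reference temperature
    temp_ref_c    -- reference temperature in degrees Celsius
    temp_c        -- current sensor temperature in degrees Celsius
    coeff_v_per_k -- breakdown voltage temperature coefficient (V/K),
                     roughly 50 mV/K according to the text
    """
    return v_bias_ref + coeff_v_per_k * (temp_c - temp_ref_c)


# Hypothetical operating point: 70.0 V at 25 degC, sensor warms up to 30 degC.
print(f"{compensated_bias(70.0, 25.0, 30.0):.2f} V")
```

A 5 K temperature rise thus calls for a bias increase of about 0.25 V under these assumptions.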

### <span id="page-29-1"></span>2.1.7 Timing Dependence

In many applications the measurement of the arrival time of incident photons is very important, especially when building a Time-of-Flight PET. Typically a discriminator circuit is used to generate a trigger signal when the sensor output exceeds a certain threshold level. The precision of this trigger signal is determined mainly by the noise of the system, the slope of the sensor signal and the jitter of the signal, as shown in Figure [2.6](#page-29-2).



<span id="page-29-2"></span>Figure 2.6: Timing errors contribution to the trigger signal *[\[Har\]](#page-115-1)*.

Figure [2.6a](#page-29-2) shows the time-walk effect, which is the shift of the trigger time caused by different signal amplitudes. This shift depends on the signal amplitude, so it can be corrected by calibration measurements.

Figure [2.6b](#page-29-2) shows the timing jitter effect. The resulting  $\Delta t$  is caused by the noise of the detector signal and of the readout electronics. This noise has a random distribution, so the timing jitter cannot be corrected, but it can be reduced: a signal with a faster rise time shows less jitter.
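The link between rise time and jitter can be made explicit with the usual first-order estimate, in which the timing jitter equals the RMS noise voltage divided by the signal slope at the threshold crossing. The sketch below uses this standard approximation, which is not stated explicitly in this text, and hypothetical noise and slope values.

```python
def timing_jitter_s(noise_rms_v, slope_v_per_s):
    """First-order jitter estimate: sigma_t = sigma_noise / (dV/dt)."""
    if slope_v_per_s <= 0:
        raise ValueError("the slope at the threshold must be positive")
    return noise_rms_v / slope_v_per_s


# Hypothetical values: 1 mV RMS noise, slopes of 10 mV/ns and 20 mV/ns.
slow = timing_jitter_s(1e-3, 10e-3 / 1e-9)
fast = timing_jitter_s(1e-3, 20e-3 / 1e-9)
print(f"slow edge: {slow * 1e12:.0f} ps, fast edge: {fast * 1e12:.0f} ps")
```

Doubling the slope halves the estimated jitter for the same noise level.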

The pileup effect, shown in Figure [2.6c](#page-29-2), also affects the timing of the trigger. It produces a baseline shift and is especially pronounced in devices with a high dark count rate (DCR). The pileup effect can be reduced by using new silicon sensors with lower DCR or by cooling the detector to lower working temperatures.

The following sections describe the front-end electronics used to cope with the SiPM signals.

### <span id="page-30-0"></span>2.2 Front-end electronics

The front-end electronics needs to be an Application Specific Integrated Circuit (ASIC) because of the large number of channels to be read out in a very small area. As specified in the previous chapter, the chip has to be designed for energy and timing measurements with silicon photomultiplier detectors. At the start of the project only the Vata64hdr16 from Gamma-Medica IDEAS [\[Ide\]](#page-114-3) met the requirements of the Petete scanner. There was also the SPIROC chip from Omega micro [\[Lal\]](#page-116-0), which copes with SiPMs but only accepts positive signals. This limits the selection of photon detectors for the scanner, and SPIROC was therefore rejected as an option.

By the end of the development of the DAQ board a new ASIC (STiC2) became available [\[Sti\]](#page-116-1). This ASIC is designed for SiPM readout in TOF-PET applications, but it has only 16 readout channels. The 64-channel version (STiC3) was planned and became available at the end of 2014. For its evaluation and testing, four connectors for the STiC chip are included on the main DAQ board. This makes the board suitable to work with both ASICs (Vata64hdr16 and STiC3), and allows future upgrades without the need to develop a new board. A description of the Vata64hdr16 chip is presented in the following sections, and the working principles of the STiC chip are described in the next chapter.

### <span id="page-30-1"></span>2.2.1 VATA64HDR16 chip

The VATA64HDR16 is a 64-channel front-end ASIC for energy measurement with SiPM detectors. The ASIC glued and bonded on a test board can be seen in Figure [2.7](#page-31-0). The analog signal processing takes place in the front-end channels of the device, while the readout is handled in the back-end. The architecture of the channel readout is shown in Figure [2.8](#page-31-1). Each channel consists of a charge sensitive preamplifier followed by two separate branches: an energy processing branch and a trigger processing branch. The trigger branch has a fast shaper with a shaping time of 50 ns, followed by a discriminator with a user-adjustable threshold. The chip integrates a Time to Analog Converter (TAC) which gives the time differences between the first triggering channel and the other channels that have been hit. This feature is not of interest for the Petete scanner.

The energy branch consists of a slow shaper with a shaping time of 100-200 ns, which provides the charge measurement, followed by a peak hold device. The peak hold device keeps the peak of the measured signal, and it can be bypassed by setting a slow control configuration bit low. Afterwards a sample and hold is applied at the peak of the signal.

After a trigger event in one of the channels, all 64 channels integrate any deposited charge in parallel. The outputs of the channels are read out via two multiplexers running in parallel, one for the sampled slow shaper charge measurement and one for the TAC values. The output of both multiplexers is available via differential output current buffers. The energy measurement can also be read out with a voltage output buffer. This can be configured through a slow control configuration register.



<span id="page-31-0"></span>Figure 2.7: Vata64hdr16 ASIC bonded on a test board.

<span id="page-31-1"></span>Figure 2.8: A block diagram of the ASIC channel [\[Ide\]](#page-114-3).

The VATA64HDR16 is an analog-in, analog-out chip. This means that its output is analog and an ADC must be used in the following stage of the electronics. The chip provides only the energy information and a trigger for the input signal. Thus, to measure the arrival time of the particle an external TDC has to be used. Additionally, the chip works with non-standard  $\pm 2.5V$  digital signals, which means that additional external components are required to adapt the digital signal levels, because the FPGA usually works only with positive signals. All this makes the design of the electronics more complex; however, the availability of a chip for SiPMs with 64 readout channels is still advantageous from the point of view of integration.

### <span id="page-32-0"></span>2.2.2 Front-end board

Having a separate board for the ASIC and the detectors allows a modular design that decouples the front-end and the back-end parts. It makes it possible to connect hybrids with different topologies to the DAQ, e.g. a hybrid with several VATA64HDR16 chips connected in a daisy chain in order to read larger detectors, or a hybrid with a different type of detector. In this way the setup is more versatile and easier to update in the future.

For test purposes a hybrid board has been designed. The board holds the detectors and the VATA64HDR16 ASIC. The detectors are SiPM arrays coupled to scintillator crystals. The four SiPM matrices are composed of 16 (4x4) pixel elements each. A picture of the board is shown in Figure [2.9.](#page-32-1) The high voltage is connected directly to the hybrid board and a filter is placed for each pixel separately.

<span id="page-32-1"></span>Figure 2.9: Hybrid board housing the ASIC Vata64hdr16 and 4 SiPM matrices.

# <span id="page-34-0"></span>Chapter 3

# STiC ASIC

The Silicon Photomultiplier Timing Chip (STiC) is a mixed-mode ASIC developed for Time-of-Flight applications in HEP and medical physics. The third generation of the chip has 64 readout channels and has been developed in the UMC 0.18  $\mu$m CMOS process. It consists of three main blocks: an analog amplifying stage, a TDC and a digital part. This chapter presents the ASIC design concept and describes the signal processing methods. The focus is on the digital block, where the main contribution of this work was made; however, a global overview of the chip is also presented.

### <span id="page-34-1"></span>3.1 Working principle

The ASIC implements 64 high-timing-resolution readout channels with the possibility of differential or single-ended connection to the MPPC. The differential analogue front-end has been optimized to improve the timing performance of the chip. A dedicated digital to analogue converter (DAC) controls the voltage at the input terminal of each channel to compensate for temperature fluctuations and for differences in the individual SiPM breakdown voltages. Each channel comprises an analogue part, a TDC and a digital part, as shown in Figure [3.1](#page-35-1). The timing and energy information are encoded in two time stamps obtained by discriminating the signal with two different thresholds. A discriminator with a very low threshold is used as the timing trigger. A second discriminator with a higher threshold is used for the energy trigger. The requirement that an event has to pass the energy threshold to be recorded rejects a large number of noise events, reducing the event rate generated by the channel. A special linearized time-over-threshold method is implemented to provide a linear relationship between the signal charge and the measured signal width. The two time stamps are processed by the integrated high precision TDC with a time binning of 50 ps. The TDC data is further processed by the digital part of the chip, where the event data is built, including the ID number of the fired channel as well as the time and energy information of each event. The events are stored in an on-chip FIFO memory that is 128 words deep. The data is then encoded using 8b/10b encoding and transmitted via a 160 Mbit/s LVDS serial link to the DAQ system. A serial peripheral interface (SPI) is used to program the chip. The configuration word includes all DAC settings, different bias voltages, the activation of a debug mode of operation and other configuration parameters of the ASIC.



<span id="page-35-1"></span>Figure 3.1: Block diagram of one STiC channel.
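As a rough illustration of how the two time stamps carry both quantities, the sketch below converts a timing and an energy time stamp into an arrival time and a time-over-threshold width. It assumes that both time stamps have already been converted to a linear count of 50 ps bins and that the counter range equals the $2^{15}-1$ coarse states times the 32 fine states described in Section 3.3; the function and constant names are hypothetical.

```python
TDC_BIN_PS = 50                      # TDC time binning from the text
FINE_STATES = 32                     # fine states per coarse period (Section 3.3)
COARSE_STATES = 2 ** 15 - 1          # states of the 15-bit LFSR coarse counter
TIMESTAMP_RANGE = COARSE_STATES * FINE_STATES  # bins before wrap-around


def arrival_time_ps(t_stamp_bins):
    """Arrival time given by the low-threshold (timing) time stamp."""
    return t_stamp_bins * TDC_BIN_PS


def time_over_threshold_ps(t_stamp_bins, e_stamp_bins):
    """Pulse width between the timing and the energy time stamp.

    The modulo handles a counter wrap-around between the two stamps.
    """
    width_bins = (e_stamp_bins - t_stamp_bins) % TIMESTAMP_RANGE
    return width_bins * TDC_BIN_PS


# Hypothetical event: timing stamp at bin 100, energy stamp at bin 2100.
print(arrival_time_ps(100), time_over_threshold_ps(100, 2100))
```

Because the time-over-threshold is linearized on chip, the returned width can be mapped to the signal charge with a linear calibration.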

### <span id="page-35-0"></span>3.2 Analog stage

Using the chip with a differential connection to the SiPM adds capacitance at the input terminal, leading to increased statistical noise in the readout. On the other hand, the differential signal connection adds the benefit of improved common mode noise rejection, which influences the timing performance of the chip. The analog input stage uses a common gate amplifier stage to provide a high bandwidth and a low input impedance for the readout of the charge pulse coming from the SiPM. The gate voltage of the input transistor is controlled by an internal DAC, which can tune the SiPM bias voltage over a range of more than 0.5 V. The signal is further processed by three subsequent high bandwidth differential amplification stages. Then, a comparator circuit with intrinsic hysteresis generates the timing trigger. The hysteresis prevents multiple triggering on noise during the slow decay of the SiPM signal.

The energy threshold is generated by an additional comparator, which uses only the positive or the negative part of the differential output signal from the input stage. An implemented switch in front of the comparator selects the desired polarity. The threshold voltage of the comparators and the switch settings are configured by the digital control logic.

The rising edge of the timing trigger and the falling edge of the energy trigger have to be merged into a single output signal which contains the processed timing and charge information of the SiPM current pulse. The time of the two edges is measured in a common TDC channel. A simplified block diagram of the logic circuit and the signal flow is shown in Figure [3.2](#page-36-1). The TDC channel is sensitive only to rising edges, therefore the two edges of the triggers have to be combined into a signal with two separate rising edges. This is done by an XOR combination of the two trigger signals, resulting in two successive pulses starting at the rising edge of the time trigger (*T-Trigger*) and the falling edge of the energy trigger (*E-Trigger*). To ensure a minimal width of the first pulse, the energy trigger is combined with a delayed copy of itself by a logic AND gate. This delays the rising edge of the energy trigger but preserves the time of its falling edge.

Figure 3.2: Trigger merging implemented in STiC [\[Har\]](#page-115-0).
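The XOR merging can be reproduced with a small discrete-time model. The sketch below represents the two triggers as lists of 0/1 samples with hypothetical edge positions and checks that the merged signal has rising edges at the rising edge of the timing trigger and at the falling edge of the energy trigger. The AND gate with the delayed copy is not modelled.

```python
def pulse(length, start, stop):
    """Sampled logic signal that is 1 for start <= i < stop, else 0."""
    return [1 if start <= i < stop else 0 for i in range(length)]


def merge_triggers(t_trigger, e_trigger):
    """XOR combination of the timing and energy trigger samples."""
    return [t ^ e for t, e in zip(t_trigger, e_trigger)]


def rising_edges(signal):
    """Sample indices at which the signal changes from 0 to 1."""
    return [i for i in range(1, len(signal))
            if signal[i - 1] == 0 and signal[i] == 1]


# Hypothetical timing: the low-threshold T-Trigger is high for samples 5..29,
# the high-threshold E-Trigger is high for samples 10..21 (nested inside).
t_trigger = pulse(40, 5, 30)
e_trigger = pulse(40, 10, 22)
merged = merge_triggers(t_trigger, e_trigger)

print(rising_edges(merged))  # [5, 22]
```

The first edge (sample 5) is the rising edge of the T-Trigger and the second (sample 22) is the first sample after the E-Trigger has fallen, so a rising-edge-sensitive TDC channel records both pieces of information.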

### 3.3 TDC

The TDC architecture is based on a combination of a coarse and a fine counter. The coarse counter runs at a low frequency and typically provides bins in the nanosecond range. The low frequency allows the use of a simple counter module covering a large measurement range. The coarse bins are further subdivided into picosecond-range bins by the fine counter. The TDC module implemented in the STiC chip has been developed at ZITI Heidelberg and is also used in the PETA ASICs [\[Fis\]](#page-117-0). It uses a digital interpolation method based on a Phase Locked Loop (PLL) and has been designed using the differential Current Mode Logic standard. A detailed description of the TDC module is presented in [\[Rit\]](#page-117-1).

The TDC is divided into two main parts: the Timebase unit and the TDC channel. The coarse and fine counter values are generated in the Timebase unit and then distributed to the TDC channels. The TDC channels contain fast latches to store the current timestamp value. A block diagram of the TDC is shown in Figure [3.3](#page-37-0). A Voltage Controlled Oscillator (VCO) consisting of 16 delay elements with propagation delay  $\tau_d$  can be tuned by a reference voltage. The reference voltage is controlled by a PLL unit. The signal propagating through the cells is inverted at the last element and fed back to the first cell. In this way the coarse counter clock period is subdivided into 32 possible states (16 ones and 16 zeros).



<span id="page-37-0"></span>Figure 3.3: Block diagram of the TDC module implemented in STiC3.

The period of the VCO clock has to be locked to an external reference clock. A phase detector unit is used to match the VCO frequency to the reference clock. This unit evaluates the phase difference between the two clocks and creates a control signal to adjust the delay of the delay cells. Once the PLL is locked, small deviations of the VCO frequency are corrected automatically. This prevents variations of the clock frequency due to temperature or power supply variations. It also makes it possible to synchronize multiple PLLs to an external reference clock, providing a common time reference for different ASICs. The PLL has been designed to work at a frequency of 625 MHz. The average delay of the VCO cells in the locked state is then  $1/(32 \times 625\,\text{MHz}) = 50$ ps.
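The relation between the PLL frequency, the number of delay elements and the fine bin width can be checked numerically. This is a minimal sketch using only the figures quoted above; the function names are illustrative.

```python
def fine_bin_ps(pll_frequency_hz, delay_elements):
    """Average delay per VCO state in picoseconds.

    The ring of inverting delay elements yields 2 * delay_elements
    states per clock period (16 ones and 16 zeros for 16 elements).
    """
    states = 2 * delay_elements
    return 1e12 / (states * pll_frequency_hz)


def coarse_period_ps(pll_frequency_hz):
    """Period of the coarse counter clock in picoseconds."""
    return 1e12 / pll_frequency_hz


print(fine_bin_ps(625e6, 16), coarse_period_ps(625e6))
```

For 625 MHz and 16 delay elements this gives a fine bin of 50 ps and a coarse period of 1.6 ns.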

The coarse counter is driven by the VCO clock output. It is implemented as a 15-bit linear feedback shift register (LFSR), i.e. it has  $2^{15}-1$  states. The counter values can be unstable due to the short time interval during which the next counter value is stored. To solve this problem the counter value is sampled twice, in two registers called Master and Slave. One of them samples the counter at the rising edge of the clock and the other at the falling edge. Thus, at least one stable counter value is guaranteed. The selection of the Master or Slave counter value is done later in the digital signal processing and is based on the fine counter value. The coarse values are later reset by a reset signal from the digital logic. The full timestamp consists of 16 bits corresponding to the VCO stage and 30 bits of the coarse counter (Master and Slave registers). For each newly recorded timestamp a Data Ready signal is generated. After the digital logic has processed the data, a reset signal is returned to the TDC channel. Then, the Data Ready signal and the latches are cleared. After a short recovery time of around 30 ns, the channel is ready to record the next timestamp.
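A 15-bit LFSR counter and its sequence length can be illustrated with the sketch below. The feedback taps are an assumption: the text does not give the polynomial used in STiC, so a well-known maximal-length choice (feedback from the two lowest bits, corresponding to $x^{15}+x+1$) is used here. The lookup table shows one common way of converting LFSR states back to a linear count in software.

```python
LFSR_BITS = 15


def lfsr15_step(state):
    """Advance a 15-bit Fibonacci LFSR by one clock cycle.

    Assumed taps (not taken from the STiC design): XOR of bits 0 and 1,
    shifted in at the most significant bit.
    """
    feedback = (state ^ (state >> 1)) & 1
    return (state >> 1) | (feedback << (LFSR_BITS - 1))


def lfsr15_period(seed=1):
    """Number of steps until the register returns to the seed state."""
    state = lfsr15_step(seed)
    count = 1
    while state != seed:
        state = lfsr15_step(state)
        count += 1
    return count


def build_decode_table(seed=1):
    """Map every LFSR state to its position in the counting sequence."""
    table = {}
    state = seed
    for index in range(2 ** LFSR_BITS - 1):
        table[state] = index
        state = lfsr15_step(state)
    return table


print(lfsr15_period())
```

The all-zero state is never reached from a non-zero seed, which is why the counter has $2^{15}-1$ rather than $2^{15}$ states.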



<span id="page-38-0"></span>Figure 3.4: Block diagram of the digital part of STiC3.

### 3.4 Digital stage

Figure [3.4](#page-38-0) shows a block diagram of the digital part of the chip. The digital core logic of STiC3 is arranged in four groups of 16 readout channels. Each group shares a common Level1 (L1) FIFO memory with a depth of 64 words. The 4-bit channel number is appended to the event data before it is written to the memory. The expected event rate for PET detector systems is around 10 kHz per channel. With the 160 MHz clock frequency of the digital core logic, a simple priority arbitration based on the channel number is therefore sufficient to control the write access of the channels to the L1 buffer. A noisy channel can be excluded from data taking through the slow control configuration register.
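A fixed-priority arbiter of this kind, together with the load estimate that justifies it, can be sketched as follows. The choice that the lowest channel number wins is an assumption made for the illustration; the text only states that the priority is based on the channel number.

```python
def priority_arbiter(requests):
    """Grant write access to one requesting channel per clock cycle.

    requests -- list of booleans, one per channel in the group.
    Returns the granted channel index, or None if nobody requests.
    Assumption: the lowest channel number has the highest priority.
    """
    for channel, requesting in enumerate(requests):
        if requesting:
            return channel
    return None


# Load estimate for one L1 group, using the figures quoted in the text.
CHANNELS_PER_GROUP = 16
EVENT_RATE_HZ = 10e3
CORE_CLOCK_HZ = 160e6
writes_per_clock = CHANNELS_PER_GROUP * EVENT_RATE_HZ / CORE_CLOCK_HZ

print(priority_arbiter([False, True, False, True]), writes_per_clock)
```

On average only about one clock cycle in a thousand sees a write request, so simultaneous requests are rare and low-priority channels are not starved in practice.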

The events stored in the four L1 buffers are processed by the *Priority select unit* and the *Master or Slave select unit* before being stored in the Level2 (L2) buffer, which has a capacity of up to 128 events. Again, the 2-bit group number is appended to the event data. The resulting channel number has 6 bits and ranges from 0 to 63. Since each L1 group concentrates the data rates of 16 channels, the write access to the L2 buffer is arbitrated using a fair priority algorithm, explained in detail in the next section.
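The composition of the 6-bit channel number from the 2-bit group number and the 4-bit channel number can be sketched as below. The bit ordering, with the group number in the two most significant bits, is an assumption for illustration; the text does not specify it.

```python
def global_channel(group, local_channel):
    """Combine a 2-bit group number and a 4-bit channel number.

    Assumption: the group number occupies the two most significant bits.
    """
    if not 0 <= group < 4:
        raise ValueError("group must be in 0..3")
    if not 0 <= local_channel < 16:
        raise ValueError("local_channel must be in 0..15")
    return (group << 4) | local_channel


def split_channel(channel):
    """Inverse of global_channel: return (group, local_channel)."""
    if not 0 <= channel < 64:
        raise ValueError("channel must be in 0..63")
    return channel >> 4, channel & 0xF


print(global_channel(2, 5), split_channel(63))
```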

The 47-bit data stored in the L2 buffer is processed further in the Frame generator unit, where it is packed into a frame as shown in Figure [3.5.](#page-39-0) The frame starts with an 8-bit Header ID, followed by 8 bits corresponding to the Frame number, and finishes with the Trailer ID and the number of hits included in the frame. The Header ID and the Trailer ID are fixed numbers. Once the frame is constructed, the data is encoded using the 8b/10b encoding [\[Alb\]](#page-117-2). This encoding maps each 8-bit symbol to a 10-bit symbol. It increases the number of transmitted bits, but generates a DC-balanced signal with an equal distribution of '0' and '1' symbols, and provides enough state changes to allow clock recovery from the data stream. The encoding also allows transmission errors to be detected by the DAQ system, since invalid 10-bit symbols can be recognized as corrupted data. Once the data is encoded, it is transmitted via a 160 MHz serial link to the PC.

| Header ID | Frame # | Event Data | Trailer ID | # Hits |
|-----------|---------|------------|------------|--------|

<span id="page-39-0"></span>Figure 3.5: Data frame format.
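As a rough plausibility check of the serial link, the following sketch compares the payload bandwidth left after 8b/10b encoding with the data volume expected at 10 kHz per channel. It assumes a 48-bit event and a line rate of 160 Mbit/s, and it ignores the frame overhead (header, frame number, trailer and hit count); the variable names are illustrative only.

```python
LINK_RATE_BPS = 160e6        # serial link bit rate (assumed 160 Mbit/s)
EVENT_BITS = 48              # assumed event size in bits
CHANNELS = 64                # channels per STiC3 chip
RATE_PER_CHANNEL_HZ = 10e3   # expected PET event rate per channel

# 8b/10b sends 10 line bits for every 8 data bits.
payload_rate_bps = LINK_RATE_BPS * 8 / 10

expected_event_rate = CHANNELS * RATE_PER_CHANNEL_HZ
expected_payload_bps = expected_event_rate * EVENT_BITS

# Fraction of the usable link bandwidth taken by event data alone.
link_utilisation = expected_payload_bps / payload_rate_bps
```

Under these assumptions the event data occupies roughly a quarter of the usable bandwidth, which leaves margin for the frame overhead and for rate fluctuations.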

In order to monitor the operation of the digital logic, a debug module was implemented. This module shares the SPI signal lines but has a separate Chip-Select signal. A finite state machine evaluates a command field in the received data from the SPI interface to perform different debugging actions. It allows to monitor the state of several important logic signals in the digital part, such as the status signal of the FIFO buffers. The debug module also allows to write event data to the L1 buffers, which can be used to test the data processing chain and verify the serial data transmission.

## 3.4.1 Priority select unit

The main idea of the Priority select unit is to give equal priority to the four memory groups. With a standard round-robin check, at least 4 clock cycles are needed to check all the groups. In the proposed design 2 clock cycles are enough to point to the next group with available data. For this purpose a finite state machine was implemented; its states are sketched in Figure [3.6.](#page-40-0) The machine stays in the Idle state until the flag of at least one of the FIFO buffers shows that data is available. The next state identifies which group has data, and the corresponding read enable flag for the L1 buffer is raised in the following state. During this process a register saves which group has been read. In the next iteration of this state, the group read in the previous iteration is ignored (even if it has data available) and, if another group has data, the pointer jumps to that group. If none of the other groups has data but the same group does, it is enabled again for reading. If none of the groups has data, the state machine returns to its initial state.

Figure [3.7](#page-40-1) shows a print screen of the simulation output of the priority select unit. The data for the four FIFO buffers is a constant value, and only the empty flag is used as a stimulus to trigger the state machine. Three scenarios are considered to test the state machine: 1) two consecutive buffers have data, 2) two non-consecutive buffers have data and 3) all buffers have data at the same time. The simulation shows that only two clock cycles are required to activate the read enable flag of the corresponding buffer, regardless of which buffer holds the data. If all the buffers contain data, they are read consecutively, giving equal priority to all buffers.
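The selection rule described above can be modelled in a few lines of software. This is a behavioural sketch only, not the actual hardware description: the order in which the other groups are searched (round-robin, starting after the group read last) is an assumption, as are the function and variable names.

```python
from typing import Optional, Sequence

N_GROUPS = 4


def next_group(has_data: Sequence[bool], last_read: Optional[int]) -> Optional[int]:
    """Pick the next L1 group to read, or None to return to the Idle state."""
    start = 0 if last_read is None else (last_read + 1) % N_GROUPS
    # Check the other groups first, skipping the group read in the last iteration.
    for offset in range(N_GROUPS):
        group = (start + offset) % N_GROUPS
        if group != last_read and has_data[group]:
            return group
    # No other group has data: the group read last may be read again.
    if last_read is not None and has_data[last_read]:
        return last_read
    return None
```

When all four groups hold data, repeated calls return 0, 1, 2, 3, 0 and so on, which corresponds to the third simulated scenario; when only one group holds data, that group is returned on every call.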

The principle of this priority select unit worked as expected, but for the final implementation more clock cycles were added to read a single group. The reason is that the L2 FIFO cannot respond so fast to the write condition and at least 2 more clock cycles are needed. However, the priority selection algorithm that decides which group must be read is maintained and implemented in the current version of the STiC3 chip.

<span id="page-40-0"></span>Figure 3.6: Block diagram of the finite state machine used for the priority select block.

<span id="page-40-1"></span>Figure 3.7: Print screen of the simulation of the priority select block.

#### 3.4.2 Master or Slave select unit

Before the event data is written to the second FIFO buffer stage, the fine counter values of the T-Trigger and E-Trigger timestamps are used to select the correct coarse counter value. The selection criterion is programmable through a parameter of the chip configuration, which specifies the fine counter ranges in which the Master or the Slave counter value is selected. For debugging purposes, a test mode has been implemented in which the same event is written twice to the FIFO, once with the Master and once with the Slave coarse counter value. The removal of the redundant data reduces the size of the events to 48 bits. The final data format of the events which are stored in the Level 2 FIFO buffer and transmitted to the DAQ system is shown in Table [5.2,](#page-85-0) section [5.2.4.](#page-84-0)
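The selection step can be pictured with the following minimal sketch. It assumes a single programmable window of fine counter values in which the Master value is taken, with the Slave value used otherwise; the real configuration parameter and its encoding are not described here, so the function and its arguments are hypothetical.

```python
def select_coarse(master, slave, fine, master_lo, master_hi):
    """Return the coarse counter value considered stable for this fine value.

    Assumption: the Master value is used when master_lo <= fine < master_hi,
    and the Slave value otherwise. The window limits stand in for the
    programmable chip configuration parameter.
    """
    if master_lo <= fine < master_hi:
        return master
    return slave
```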

# 3.5 Backend design

The physical implementation of the chip is done in the UMC $0.18\,\mu$m technology on a die of $5 \times 5$ mm<sup>2</sup>. The floorplan of the ASIC is done with *Encounter* (from *Cadence*) and a print screen of the layout is shown in Figure [3.8.](#page-41-0) The digital core is located in the center of the chip, between the readout channels distributed symmetrically on the left and right sides. The Faraday standard cell library [\[Far\]](#page-117-3) has been used for the implementation of the digital logic. The Level 1 FIFO buffers are placed close to their associated groups and the Level 2 event buffer is situated in the center of the digital part. The digital IO cells are placed in the center of the top and bottom sides of the ASIC to reduce interference with the analog IO. The analog IO cells on the left and right sides use the staggered placement option to fit within the die size and the pad width.



<span id="page-41-0"></span>Figure 3.8: Print screen of the layout of the chip

Figure [3.9](#page-42-0) shows the STiC chip, bonded on a custom-designed cavity board. The chip is glued inside the cavity and the bonding is done in two levels. This method avoids the need for vias and facilitates the signal routing and the impedance matching required to preserve the signal integrity of the differential lines.

The advantage of the STiC chip is that its output is fully digital and the TDC is already integrated inside the chip. The level of integration obtained is therefore very high, because fewer external components are needed. The only drawback is the elevated power consumption, which requires the design of appropriate cooling.

<span id="page-42-0"></span>

Figure 3.9: STiC chip bonded on a cavity board

# Chapter 4

# Main acquisition board

This chapter describes the design flow of the main acquisition board. First, a general overview of the board and its different sections is presented. Then follows the description of the ASIC signals, which determine the input/output requirements of the DAQ board. Finally, the different hardware blocks are explained in detail, together with the reasoning behind their design.

# 4.1 General view of the acquisition board

The developed acquisition board, called HDRDAQ, has been designed to manage the readout process of the full-ring small animal PET scanner prototype Petete. For this purpose four boards have to work synchronously to cover the 16 detector heads, each board handling 4 hybrid boards. The main functions of the board are to control the Vata64hdr16 ASIC, to process the analog data, to control the acquisition process and to communicate with the PC. Figure [4.1](#page-45-0) shows the block diagram of the board and its main functional units.

The HDRDAQ board is remotely controlled by the user through a user-friendly software program installed on a PC. The program generates specific commands/words that indicate to the FPGA the desired action and the target hybrid board. The FPGA recognizes the command and generates the necessary signals towards the peripheral components to complete the specified function. In this way the system can be configured in terms of threshold levels, bias voltages, operation mode and other properties, which are explained in detail in the following sections of this chapter.

For the communication with the PC, a LogiCORE IP Tri-Mode Ethernet Media Access Controller (TEMAC, from Xilinx) core is implemented in the FPGA. This core is connected to an Ethernet physical layer (PHY) transceiver which supports full-duplex operation at selectable rates of 10 Mbps, 100 Mbps and 1 Gbps. The User Datagram Protocol (UDP) is used for the data transfer.



<span id="page-45-0"></span>Figure 4.1: Main functional blocks of the HDRDAQ board.

The ASIC has three analog outputs, but only two of them can work in parallel. This means that two analog-to-digital converters are needed per DB and one of them is shared between two analog outputs of the ASIC. In the DAQ board the analog signals are passed through two differential amplifiers and then through a passive filter. An octal, 12-bit ADC with pipelined architecture at 40 MSPS digitizes the analog data and sends it to the FPGA in LVDS format.

The **Bias control block** generates the analog voltage levels (biases) needed by the ASICs for their correct operation. The acquisition board has been designed to work with different bias voltage levels and polarities. Analog switches, controlled by the FPGA, select which polarity is connected to the output pin.

The Vata64HDR16 generates a current-mode logic trigger signal when a hit is detected. This signal is converted to an LVDS signal and sent to the FPGA. This is represented by the Trigger block in the block diagram of the board. This block also creates a copy of the trigger signal and passes it to a connector (sync. signal connector) so that the signal can be shared with the other HDRDAQ boards. The sync. signal connector block also includes LVDS signals connected directly to the FPGA, intended for the synchronization between the different HDRDAQ boards.

The **Slow control** block and the **Readout control** block implement the interface of the signals for the slow control register of the ASIC and the control signals during the process of acquisition, respectively.

Calibration options are included in the Vata64hdr16 chip, with the possibility of testing each readout channel individually. The channels selected for testing are sensitive to analog test signals injected at the dedicated input pin. For easier ASIC characterization, a test signal can be provided by the HDRDAQ board. This pulse is either generated internally by the system or introduced externally via an input connector, in which case an external pulse generator has to be used.

The STiC ASIC is a relatively new chip; the version with 64 channels has been available since the end of 2014. The previous version of the chip had only 16 channels, which is why developing the HDRDAQ board exclusively for this chip was ruled out. The group at IFIC is nevertheless interested in testing the performance of the chip. Using the available space and resources of the acquisition board (such as the power supply, FPGA resources and communication block), a connector for the STiC chip has been included. The chip needs a dedicated 640 MHz oscillator and drivers to split and distribute the fast clock signal to 4 chips. These components are shown in the scheme in Figure [4.1](#page-45-0) as the STiC interface block.

Output connectors are included on the board for monitoring of the acquisition process with a scope. Those outputs are free and are defined by the firmware according to the user needs.

The HDRDAQ has its own power supply block based on DC-DC modules and linear regulators, providing the required supply voltages from a single input voltage (6 V) that can be supplied by a desktop adapter. Since the hybrid connected to the HDRDAQ can be equipped with a different number of ASICs (from 1 to 16) and with different types (of the VATA family) with similar but not equal voltage levels, a separate power supply is required for the hybrid. For that reason a separate power-in connector is included on the DAQ board.

A flash memory is responsible for configuring the FPGA at each power-up of the board.

In general, the design of the HDRDAQ board has been guided by the goal of making the acquisition board as general as possible, so that it can be easily upgraded in the future for new front-end electronics or new detector types. Some modification of the firmware may be needed, but no hardware changes, which are normally the most time-consuming and expensive ones.

# 4.2 Vata64hdr16 signals specifications

For the design of the HDRDAQ board the first thing to do was to study the signals between the ASIC and the DAQ board. These signals are summarized by their type, direction and functional group in Table [4.1.](#page-47-0)

The first group comprises the signals for the slow control register configuration of the ASIC: *clkin, regin, cs\_en, regout* and *load*. They are single-ended digital signals in which $+2.5$ V corresponds to logic '1' and $-2.5$ V to logic '0'. These signals are used to load the 872-bit configuration register of the Vata64hdr16 ASIC.

| Group                       | Name          | Type                     | Direction               |
|-----------------------------|---------------|--------------------------|-------------------------|
| <b>Slow control signals</b> | sc\_en        | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | load          | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | regin         | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | regout        | logical                  | $ASIC \Rightarrow FPGA$ |
|                             | clkin         | logical                  | $FPGA \Rightarrow ASIC$ |
| <b>Readout signals</b>      | hold          | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | dreset        | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | shift\_in\_b  | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | shift\_out\_b | logical                  | $ASIC \Rightarrow FPGA$ |
|                             | ckb           | logical                  | $FPGA \Rightarrow ASIC$ |
|                             | maresp/maresm | low voltage differential | $FPGA \Rightarrow ASIC$ |
|                             | ta/tb         | current mode logic       | $ASIC \Rightarrow FPGA$ |
| <b>Bias signals</b>         | vfp           | analog                   | $FPGA \Rightarrow ASIC$ |
|                             | vfsf          | analog                   | $FPGA \Rightarrow ASIC$ |
|                             | vthr          | analog                   | $FPGA \Rightarrow ASIC$ |
|                             | mbias         | analog                   | $FPGA \Rightarrow ASIC$ |
|                             | other signals | analog                   | $FPGA \Rightarrow ASIC$ |
| <b>Data signals</b>         | voutp/voutm   | differential voltage     | $ASIC \Rightarrow FPGA$ |
|                             | outp/outm     | differential current     | $ASIC \Rightarrow FPGA$ |
|                             | outp2/outm2   | differential current     | $ASIC \Rightarrow FPGA$ |
| <b>Test signal</b>          | cali          | analog                   | $ASIC \Rightarrow FPGA$ |
| <b>Power supply</b>         | AVSS          | power                    |                         |
|                             | AVDD          | power                    |                         |
|                             | GND           | power                    |                         |

<span id="page-47-0"></span>Table 4.1: List of the signals between the HDRDAQ board and the Vata64hdr16 chip.

The second group comprises the readout signals, which are digital input/output signals. The signals *shift\_out\_b*, *shift\_in\_b*, *hold*, *dreset* and *ckb* are again digital 'logical' signals. The trigger signal *ta/tab* is a current mode logic differential signal generated by the ASIC. There is one more signal, *maresp/maresm*: the reset signal for the peak-and-hold device and the TAC unit inside the chip. It is generated by the FPGA, converted to a low-voltage differential signal and sent to the ASIC at the end of each readout sequence, but only when the peak-and-hold device or the TAC unit is used.

The third group is composed of the so-called *'Bias'* signals. They are all analog input signals for the ASIC and can be currents or voltages, positive or negative. These levels influence the gain, threshold and behavior of the ASIC. Only the most important bias signals are listed in Table [4.1.](#page-47-0)

The fourth group is the analog data output from the ASIC. These are the signals voutp/voutm, outp/outm and outp2/outm2. They contain the charge information of the received hit signal.

The ASIC power supply is $\pm 2.5$ V, which determines the logic levels of its digital signals. The FPGA works only with positive digital signals, so a level conversion is needed. The ASIC also works with many analog signals, which requires the use of analog-to-digital converters (ADCs) and digital-to-analog converters (DACs). All the types of conversion implemented on the DAQ board are described in the following section.

# 4.3 Analog Part

The analog part comprises the analog amplifiers, the ADC and the bias signals. Each hybrid board or ASIC provides three differential analog output signals: *vout*, *out* and *out2*. In normal operation of the chip only the *vout* or the *out* output is used. The main difference between them is whether a voltage or a current output is desired. Because the chip behaves differently with each output (for example, a higher gain can be achieved with the voltage output), both types of output are supported, but not at the same time. An octal ADC converts the amplified data from the four hybrid boards. To match the number of available ADC channels to the number of output signals, the *vout* and *out* outputs share the same ADC channel. A switch selects which of the two is connected to the ADC. The analog amplification stage is explained in detail below.

## 4.3.1 Analog Input Signals

The charge generated in a pixel of the SiPM is passed to the readout channel of the ASIC. The ASIC dynamic range is up to $\pm 55$ pC (voltage output), which means that it works with very small current signals. The measured charge is available at the differential analog output of the ASIC, with a range of a few mV for the voltage output and from 0 to $200\,\mu$A for the current output. The pulse amplitude is proportional to the measured charge.

After a trigger event and after the integration time of the slow shaper of the ASIC, the analog value of *channel 0* is presented at *outp/outm*. For each clock (*gck*), the shift bit is clocked to the next channel. After 64 clocks the analog values of all 64 channels have been presented at the output buffer. The amplitude of each channel has to be sampled with the ADC and its value stored temporarily in a buffer in the FPGA. The *gck* clock supports frequencies from the kHz range up to 10 MHz, which conditions the choice of the ADC sampling rate.



<span id="page-49-0"></span>Figure 4.2: Analog output signal in scope mode. Channel 4 (green) is the test pulse and channel 2 (blue) is the analog output signal (outp).

Another readout mode is the *scope* mode. In this case the ASIC is configured in 'test mode', which means that only one channel is connected to the test input. With the *gck* clock signal the output shift register is set to point to the output of the same channel. If a test pulse is applied to the test input or to the input of that channel, the shaped analog output signal can be observed at the output of the ASIC. It has a Gaussian shape and is shown in Figure [4.2.](#page-49-0) Channel 4 of the oscilloscope shows the test pulse and channel 2 the analog output (*outp*) signal measured with a single-ended scope probe. The test pulse produces two output pulses, one from its rising edge and one from its falling edge. The negative pulse is ignored since the ASIC is configured for positive signals. In this mode the shape of the signal can be reconstructed, and for this a fast ADC sampling rate is required. The scope mode is mainly used to adjust the time at which the hold signal is applied so that the peak of the pulse is sampled correctly.

#### 4.3.2 Amplifiers and filters

The amplifier chosen for this application is the AD8139 from Analog Devices [\[AD8\]](#page-116-0), a high-speed, low-noise differential amplifier. It is designed to provide two balanced differential outputs in response to either differential or single-ended input signals. The differential gain is set by external resistors, and the common-mode level of the output voltage is set by a voltage at the $V_{OCM}$ pin and is independent of the input common-mode voltage. It also has a rail-to-rail output, providing maximum dynamic output range. Figure [4.3](#page-50-0) shows a schematic of the analog part of the HDRDAQ board. The signals, in this case *TACp/TACn*, pass through the amplifying stages and an optional filtering stage. The amplifier output common-mode level is set by a voltage divider composed of two 10 kΩ resistors connected to ±5 V. The first stage inverts the signal polarity, but the signal is inverted again before the ADC, so the polarity of the converted signal is the same as that of the input signal. For a current input the gain of the amplifier is defined by the equation:

$$
V_{out1} = -R_f \cdot I \tag{4.1}
$$

where $R_f$ is the feedback resistor. The ASIC output current ranges from 0 to 200 $\mu$A and $R_f = 1$ k$\Omega$, so after the amplification stage the output amplitude reaches 0.2 V at full scale. If the input is a voltage (for the *voutp/voutm* signals) the gain is defined by the equation:

$$
G_2 = -\frac{R_f}{R_s} \tag{4.2}
$$

where $R_s = 200\ \Omega$ and $R_f = 1$ k$\Omega$, which gives a gain of magnitude 5. The output voltage can then be calculated as follows:

$$
V_{out} = V_{out1} \cdot G_2 \tag{4.3}
$$
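The three expressions can be evaluated numerically with the resistor values quoted in the text. The sketch below reads equation (4.3) as the cascade of the transimpedance stage (4.1) and the voltage gain (4.2); this reading, and the function names, are assumptions made for illustration.

```python
R_F_OHM = 1_000.0   # feedback resistor Rf
R_S_OHM = 200.0     # input resistor Rs
I_MAX_A = 200e-6    # full-scale ASIC output current


def first_stage_voltage(current_a):
    """Equation (4.1): V_out1 = -Rf * I."""
    return -R_F_OHM * current_a


def voltage_gain():
    """Equation (4.2): G_2 = -Rf / Rs."""
    return -R_F_OHM / R_S_OHM


def output_voltage(current_a):
    """Equation (4.3): V_out = V_out1 * G_2 (assumed cascade of both stages)."""
    return first_stage_voltage(current_a) * voltage_gain()


v_out1_full_scale = first_stage_voltage(I_MAX_A)   # about -0.2 V
v_out_full_scale = output_voltage(I_MAX_A)         # about +1.0 V
```

Under this reading the two inversions cancel, so a positive input current gives a positive output voltage, in line with the statement that the polarity of the converted signal matches that of the input.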

For this closed-loop gain configuration the total output differential voltage noise is 19.7 nV/$\sqrt{Hz}$ and the 3 dB bandwidth is 53 MHz, according to the data sheet of the amplifier.



<span id="page-50-0"></span>Figure 4.3: Schematic of one analog channel. The analog signals, TACn/TACp, pass through the amplifying stages followed by a passive filter before the ADC.

The amplification stages are followed by an optional passive filter stage. A ferrite chip bead (BLM18BA100SN1D) is included for electromagnetic interference (EMI) suppression. Figure [4.4](#page-51-0) shows the impedance characteristic of the ferrite bead. The impedance of the ferrite rises sharply for signals with frequencies above 30 MHz, which filters the high-frequency noise while keeping a low impedance for the low-frequency signal.



<span id="page-51-0"></span>Figure 4.4: Ferrite characteristics

## 4.3.3 ADC block

There is a wide variety of ADCs on the market, which makes it difficult to choose the right one for a specific application. In the case of the HDRDAQ board the following factors were considered:

- $\Diamond$ The maximum sampling rate required by the ASIC is 10 MSPS (mega samples per second).
- $\Diamond$ In *scope* mode the pulse shape has to be reconstructed, so a faster sampling rate gives a better shape reconstruction.
- $\Diamond$ The DAQ board should remain usable with future, faster versions of the ASIC or with other new types of front-end electronics.
- $\Diamond$ The ASIC analog output is a differential signal, so the ADC has to accept differential input signals. This type of signal has better noise immunity than the single-ended type.
- $\Diamond$ There are two analog outputs per ASIC that can work in parallel, so eight ADC channels are needed to cover the four hybrid boards.

Keeping these conditions in mind, the AD9222 from Analog Devices [\[AD9\]](#page-116-1) has been chosen. It has a sampling rate of 40 MSPS and a resolution of 12 bits. Figure [4.5](#page-52-0) shows the functional block diagram of the converter. There are 8 ADCs in the same package, which optimizes space and layout on the DAQ board.

<span id="page-52-0"></span>Figure 4.5: AD9222 - functional block diagram.

The ADC operates with 1.8V power supply and LVDS sample rate clock. No external reference or driver components are required. The ADC automatically multiplies the sample rate clock for the appropriate LVDS serial data rate. A data clock output (DCO) for capturing data on the output and a frame clock output (FCO) for signaling a new output byte are provided by the ADC. It typically consumes less than 2 mW when all channels are disabled. For best dynamic performance, the source impedance driving  $VIN + x$  and  $VIN - x$  should be matched such that common-mode settling errors are symmetrical. These errors are reduced by the common-mode rejection of the ADC. An internal reference buffer creates the positive and negative reference voltages, REFT and REFB, respectively, that define the span of the ADC core. The output common-mode of the reference buffer is set to mid-supply, and the REFT and REFB voltages and span are defined as:

$$
R_{EFT} = 0.5 \cdot (AVDD + V_{REF}) = 0.5 \cdot (1.8 + 1) = 1.4\,V \tag{4.4}
$$

$$
R_{EFB} = 0.5 \cdot (AVDD - V_{REF}) = 0.5 \cdot (1.8 - 1) = 0.4\,V \tag{4.5}
$$

$$
S_{pan} = 2 \cdot (R_{EFT} - R_{EFB}) = 2 \cdot V_{REF} = 2\,V \tag{4.6}
$$

It can be seen from these equations that the  $R_{EFT}$  and  $R_{EFB}$  voltages are symmetrical about the mid-supply voltage and, by definition, the input span is twice the value of the  $V_{REF}$  voltage. Maximum SNR performance is achieved by setting the ADC to the largest span in a differential configuration. In the case of the AD9222, the largest input span available is  $2V_{pp}$ . The signal to noise ratio (SNR) can be calculated with the formula:

$$
SNR = (6.02n + 1.76)\,dB \tag{4.7}
$$

where n is the ADC resolution. In the case of AD9222 the resolution is 12 bits so the  $SNR = 74$  dB. With the resolution and the input range, the LSB of the ADC can be calculated with the formula:

$$
1\,LSB = \frac{S_{pan}}{2^{12}} = \frac{2V_{pp}}{4096} \approx 0.5\,mV \tag{4.8}
$$

The effective number of bits (ENOB) is related to signal-to-noise and distortion ratio (SINAD) by the following equation:

$$
ENOB = \frac{SINAD - 1.76}{6.02} = \frac{70 - 1.76}{6.02} = 11.33 bits \tag{4.9}
$$

The ENOB is a way to measure the quality of digitized signal. In the case of AD9222 only 0.7 bits are below the noise level.
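The figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below uses the supply, reference, resolution and SINAD values given in the text; the function names are illustrative only.

```python
AVDD_V = 1.8      # ADC supply voltage
V_REF_V = 1.0     # reference voltage
N_BITS = 12       # ADC resolution
SINAD_DB = 70.0   # SINAD value used in equation (4.9)


def ideal_snr_db(bits):
    """Ideal quantisation-limited SNR, equation (4.7)."""
    return 6.02 * bits + 1.76


def lsb_volts(span_v, bits):
    """Voltage step of one LSB over the input span."""
    return span_v / (2 ** bits)


def enob(sinad_db):
    """Effective number of bits, equation (4.9)."""
    return (sinad_db - 1.76) / 6.02


reft = 0.5 * (AVDD_V + V_REF_V)     # about 1.4 V
refb = 0.5 * (AVDD_V - V_REF_V)     # about 0.4 V
span = 2 * (reft - refb)            # about 2 V

snr_db = ideal_snr_db(N_BITS)       # about 74 dB
lsb_v = lsb_volts(span, N_BITS)     # about 0.49 mV
lost_bits = N_BITS - enob(SINAD_DB) # about 0.7 bits
```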

Regarding the static parameters of the ADC, according to the data sheet, the integral non-linearity (INL) is $\pm 0.4$ LSB following the straight line method and the differential non-linearity (DNL) is $\pm 0.3$ LSB. This means that the measured value can deviate from its real value by less than half an LSB.

The ADC contains several features, such as programmable clock and data alignment, and programmable digital test pattern generation. The available digital test patterns include built-in deterministic and pseudo-random patterns, along with custom user-defined test patterns programmed via the serial port interface (SPI). The AD9222 has a pipelined architecture, which permits the first stage to operate on a new input sample while the remaining stages operate on preceding samples. In the last stage the data is serialized and aligned to the frame and data clocks. The LVDS driver current is derived on chip and sets the output current at each output to a nominal 3.5 mA. A 100 $\Omega$ differential termination resistor placed at the LVDS receiver inputs results in a nominal 350 mV swing at the receiver.

The octal ADC covers the eight analog channels needed. The 12 bits over the $2\,V_{pp}$ input range give a resolution of 0.5 mV per bit, which is sufficient for the application. There is one point of conflict between this ADC and the ASIC. The ASIC *gck/gckb* clock signal is specified at 1 MHz in this application (even though it can run at up to 10 MHz), which is a very low frequency for the ADC: the minimum sampling rate of the converter, defined by the input clock signal, is 5 MHz. This has been solved by providing a 16 MHz sampling clock to the ADC and a 1 MHz clock to the ASIC. In the serial running mode, an algorithm implemented in the firmware selects one sample out of the 16 received for each channel, so that only one sample is stored for the next stage of data processing. This technique leaves open the option of implementing an algorithm in the future that selects and temporarily saves several samples per readout channel. By averaging the samples taken for each channel, both the noise and the differential non-linearity error of the converter can be reduced. Figure [4.6](#page-54-0) shows the schematic of the interconnected signals between the ADC and the FPGA.
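The sample selection can be sketched in software as follows. This is only a behavioural illustration of the idea, not the firmware itself: which of the 16 samples is kept, and the window used for the averaging variant, are assumptions, as are the function names.

```python
from statistics import mean

OVERSAMPLING = 16   # ADC samples per ASIC readout clock (16 MHz / 1 MHz)
N_CHANNELS = 64     # readout channels per ASIC


def pick_one(samples, index=OVERSAMPLING - 1):
    """Keep a single ADC sample per channel.

    Assumption: the sample at position `index` within each group of 16
    is the one taken after the analog output has settled.
    """
    return [samples[ch * OVERSAMPLING + index] for ch in range(N_CHANNELS)]


def average_window(samples, first=8, last=16):
    """Possible future variant: average samples first..last-1 of each channel."""
    return [
        mean(samples[ch * OVERSAMPLING + first: ch * OVERSAMPLING + last])
        for ch in range(N_CHANNELS)
    ]
```

Both functions expect a flat list of `64 * 16` consecutive ADC samples for one readout sequence and return one value per channel.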

In scope mode all 256 consecutive samples of the ADC are stored. In this mode the ASIC works without the *gck* clock and the analog waveform from the ASIC has to be reconstructed.



<span id="page-54-0"></span>Figure 4.6: Signals between the ADC and the FPGA.

The FPGA sends to the ADC a 16 MHz sampling clock (CLK). This frequency can be changed easily by the firmware program if needed. Data from each ADC is serialized and provided on a separate channel. The data rate for each serial stream is equal to 12 bits times the sample clock rate in this application i.e.

$$
12bits \times 16MSPS = 192Mbps \tag{4.10}
$$

Two output clocks are provided to assist in capturing data from the AD9222. The data clock output (DCO) is used to clock the output data and is equal to six times the sample clock (CLK) rate. Data is clocked out of the AD9222 and must be captured on both the rising and falling edges of the DCO, the so-called double data rate (DDR) capture. The frame clock output (FCO) is used to signal the start of a new output byte and is equal to the sample clock rate (see the timing diagram shown in Figure [4.7](#page-55-0)). Using an ADC with serial output instead of parallel output makes the layout signal routing much simpler. In general a serial readout introduces a certain delay in the data compared with a parallel readout, but the selected ADC features an output clock six times faster than the sampling clock, combined with DDR readout. This makes it possible to read the 12-bit data at the rate of the sampling clock.

<span id="page-55-0"></span>Figure 4.7: Timing diagram, 12-Bit Data Serial Stream, LSB First, AD9222, [\[AD9\]](#page-116-1).
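The relation between the clocks and the reassembly of the serial stream can be sketched as follows. The LSB-first bit order follows the timing diagram of Figure 4.7; whether the board actually configures the ADC in this mode is an assumption, and the function name is illustrative. The sketch also assumes that the bit stream is already aligned to the frame clock.

```python
BITS_PER_SAMPLE = 12
SAMPLE_CLOCK_HZ = 16e6

serial_rate_bps = BITS_PER_SAMPLE * SAMPLE_CLOCK_HZ  # bit rate of one serial lane
dco_hz = 6 * SAMPLE_CLOCK_HZ                         # data clock output frequency
bits_per_dco_period = 2                              # DDR: one bit on each clock edge


def words_from_bits(bits):
    """Rebuild 12-bit samples from a frame-aligned serial stream, LSB first."""
    words = []
    for start in range(0, len(bits) - BITS_PER_SAMPLE + 1, BITS_PER_SAMPLE):
        word = 0
        for position, bit in enumerate(bits[start:start + BITS_PER_SAMPLE]):
            word |= (bit & 1) << position
        words.append(word)
    return words
```

Since `dco_hz * bits_per_dco_period` equals `serial_rate_bps`, all 12 bits of a sample are transferred within one period of the sampling clock.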

#### 4.3.4 Bias signals

The bias signals are analog voltage levels necessary for the proper operation of the ASIC. The Vata64HDR16 has about 22 biases, but almost all of them are generated internally from the reference master bias (mbias). However, it is sometimes necessary to adjust or force the biases to values other than the nominal ones; for this reason the hybrid board provides access to all biases of the chip. Previous experience with the chip shows that only a few of the biases really need to be controlled externally. The most used biases and their characteristics are:

- $\Diamond$  Mbias: Master bias. It is a current bias with a nominal value of  $700\mu A$ . To sense the Mbias value, a 5 k$\Omega$  resistor is placed in series on the line. By measuring the voltage drop across the resistor, the current passing through it can be calculated. The resistor value has been chosen so that the maximum DAC value, which is 4096 mV, does not exceed the maximum value acceptable by the ASIC. The DAC resolution is 1 mV per bit, which corresponds to 200 nA per bit. This resolution is sufficient for the application.
- $\Diamond$  VFS: Voltage feedback for the slow shaper. It is a voltage bias with no specified nominal value. In the DAQ board a positive voltage from 0 to 2.5V can be generated. An analog switch is included in the design so that the input to the ASIC can be completely disconnected when it is not used. This prevents any influence of possible DAC offsets on the sensitive bias input. In normal use this bias is disabled.

- $\Diamond$  VTHR: Threshold for the discriminator. This bias can be a positive or a negative voltage, depending on the polarity of the analog input signal. The range is  $\pm 2.5V$ .

In the HDRDAQ board these biases are generated by digital-to-analog converters (DACs) and are controlled by the software. The polarity of each bias is selected by analog switches, which can be enabled or disabled, except for the master bias, which is always enabled.
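The Mbias figures quoted in the list above follow from Ohm's law. The sketch below reproduces them under the simplifying assumption that the whole DAC output voltage appears across the 5 kΩ sense resistor; the function names are hypothetical.

```python
R_SENSE_OHM = 5_000.0  # series sense resistor on the Mbias line (from the text)
DAC_LSB_V = 1e-3       # DAC resolution of 1 mV per bit (from the text)
V_FULL_SCALE = 4.096   # maximum DAC value quoted in the text


def mbias_current_a(dac_code: int) -> float:
    """Current through the sense resistor for a given DAC code (assumed ideal)."""
    return dac_code * DAC_LSB_V / R_SENSE_OHM


def dac_code_for_current(current_a: float) -> int:
    """Nearest DAC code that produces the requested bias current."""
    return round(current_a * R_SENSE_OHM / DAC_LSB_V)


print(mbias_current_a(1))             # current step per DAC bit
print(dac_code_for_current(700e-6))   # code for the nominal 700 uA
print(V_FULL_SCALE / R_SENSE_OHM)     # current at the maximum DAC value
```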

In the case of the STiC chip all biases are configurable through the slow control configuration register. DACs internal to the chip generate the voltage levels. This considerably reduces the number of external components needed for the ASIC operation.

#### DAC characteristics

The DAC selected for the application is the LTC2620 from Linear Technology [\[Lit\]](#page-116-2). It integrates 8 channels with 12 bit resolution. Figure [4.8](#page-56-0) shows the internal block structure of the converter.



<span id="page-56-0"></span>Figure 4.8: LTC2620 internal block diagram.

The DAC uses a 32-bit shift register, which is accessed through an SPI serial interface for remote programming. The digital-to-analog transfer function is:

$$
V_{OUT(IDEAL)} = \left(\frac{k}{2^N}\right) \cdot V_{REF},\tag{4.11}
$$

where  $k$  is the decimal equivalent to the binary DAC input code,  $N$  is the number of bits and the voltage reference  $V_{REF} = 4.096V$  is the maximum output voltage. The obtained resolution is 1 mV per bit. The  $V_{REF}$  is given by LT1790B from Linear Technology. It provides a stable voltage reference for the DAC.
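Equation 4.11 can be evaluated directly. The following sketch assumes an ideal converter with the values given in the text ($N = 12$, $V_{REF} = 4.096V$); the helper names are illustrative and do not belong to any driver.

```python
V_REF = 4.096  # reference voltage in volts (from the text)
N_BITS = 12    # DAC resolution in bits (from the text)


def dac_output_voltage(code: int) -> float:
    """Ideal output voltage for a DAC input code k, Eq. 4.11."""
    if not 0 <= code < 2 ** N_BITS:
        raise ValueError("code out of range for a 12-bit DAC")
    return code / 2 ** N_BITS * V_REF


def dac_code_for_voltage(voltage: float) -> int:
    """Nearest valid code for a requested output voltage."""
    code = round(voltage / V_REF * 2 ** N_BITS)
    return max(0, min(2 ** N_BITS - 1, code))


print(dac_output_voltage(1))       # one LSB
print(dac_code_for_voltage(2.5))   # code for a 2.5 V bias
```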

The next section is dedicated to the digital signals generated and received by the HDRDAQ board.

# 4.4 Digital part

This part explains the digital input/output signals for the ASIC. It includes the digital *logical* and differential signals. The complexity of this part comes from the fact that the DAQ board has to work with different types of ASIC. The power supply can differ between ASICs, and so can the voltage levels of the digital signals. For that reason the power supply for the hybrid boards is provided externally and is routed to all the drivers of the ASIC interface. The drivers selected for the application accept  $\pm 5V$, a range wide enough for the  $\pm 2.5V$  power supply needed by the Vata64HDR16 chip. The following sections explain the design of the interface signals between the HDRDAQ board and the Vata64HDR16 and STiC ASICs.

#### Digital signals from the ASIC to the FPGA

The digital signals from the ASIC to the FPGA are regout and  $\textit{shift\_out\_b}$ . The regout signal is the output of the slow control configuration register and  $\textit{shift\_out\_b}$  is the output of the  $\textit{shift\_in\_b}$  signal. They are of *logical* type, i.e. their range is  $\pm 2.5V$ . The FPGA accepts only positive signals, so these signals cannot be connected directly to the FPGA. For the interface, the LT1715 comparator from Linear Technology, which has independent input and output power supplies, is used. The chosen input supply is  $\pm 5V$  and the output supply is  $+3.3V$ . A dual package is used in the design for better layout symmetry.

#### Digital signals from the FPGA to the ASIC

This section refers to the signals related to the readout of the analog output of the ASIC and to the slow control. The signals coming from the FPGA are LVTTL (3.3V), while the ASIC expects signals with logical levels '0' = -2.5V and '1' = +2.5V (in the case of the Vata64HDR16). The interface of the signals of type 'logical' between the FPGA and the ASIC is also done through a comparator. In this case the rail-to-rail MAX9034 comparator from Maxim is employed. It is powered with  $\pm 2.5V$  and connected to a bank with a  $+2.5V$  power supply. A package with 4 comparators is used, so 2 packages provide the signals needed for a single ASIC.

#### 4.4.1 The trigger

When the chip receives a valid input signal after the integration time, it generates a trigger signal. The DAQ board receives this signal and starts the readout sequence. The ASIC output is a differential current mode logic signal whose amplitude is too small to be interpreted directly by the FPGA as a standard LVDS signal. The trigger interface in the HDRDAQ board is therefore implemented in 2 stages: the signal first passes through an amplifier and then goes to a comparator. The resulting signal for the FPGA is of LVDS type. A copy of the trigger signal is sent to a connector to be shared with the other DAQ boards. The idea is that the four HDRDAQ boards receive a copy of the trigger signal, so that every DAQ board can decide by itself whether there is a coincidence event with one of its own triggers and, if so, start the readout sequence. Matching coincidence events before the readout process starts saves a large amount of dead time that would otherwise be spent reading background events. It also significantly reduces the amount of data to be transmitted to the PC.
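The coincidence decision described above can be modelled in software. The sketch below is only an offline illustration of the idea, not the FPGA logic: the function name, the use of integer nanosecond timestamps and the 5 ns window are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def find_coincidences(own_times_ns, other_times_ns, window_ns):
    """Pair each trigger of this board with triggers of other boards
    that arrive within +/- window_ns of it."""
    others = sorted(other_times_ns)
    pairs = []
    for t_own in sorted(own_times_ns):
        first = bisect_left(others, t_own - window_ns)
        last = bisect_right(others, t_own + window_ns)
        pairs.extend((t_own, t_other) for t_other in others[first:last])
    return pairs


# Hypothetical timestamps: only the trigger at 100 ns has a partner.
matches = find_coincidences([100, 500], [103, 900], window_ns=5)
print(matches)
```

Only the triggers that appear in `matches` would lead to a readout, which is how the scheme avoids spending readout time on background events.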

Figure [4.9](#page-58-0) shows a sketch of the trigger conversion path and Figure [4.10](#page-59-0) shows the simulation of the amplification stage. The  $20\mu\text{A}$  input signal is amplified and inverted, and it is inverted again in the next stage. The resulting positive signal is sent to the FPGA.



<span id="page-58-0"></span>Figure 4.9: Sketch of the current mode logic to LVDS conversion.

## 4.4.2 Differential signals from the FPGA to the ASIC

This section refers to the Maresm/Maresp reset signal. The conversion from LVTTL to LVDS (2.5V) employs a zener diode, a resistor and the DS90C031 driver from Texas Instruments. The reverse current through the zener diode is limited with the resistor to  $125\mu$ A. This produces a voltage drop of 2.4V. The zener diodes interface the voltage levels from 0V to -2.4V and from 3.3V to 0.9V. The driver needs to be powered with



<span id="page-59-0"></span>Figure 4.10: Simulation of the amplification stage of the trigger signal.

 $\pm 2.5V$ to fulfill the minimum voltage range of 4.5V required for its correct operation and to match the power supply of the ASIC. Figure [4.11](#page-59-1) shows the schematic of the conversion.



<span id="page-59-1"></span>Figure 4.11: Schematic for the interface of the LVTTL to LVDS.
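The level shift performed by the zener diode and the supply span of the driver can be verified with a few lines. This is a minimal sketch that treats the zener as an ideal 2.4 V drop; the names are illustrative.

```python
ZENER_DROP_V = 2.4            # voltage drop across the zener diode (from the text)
DRIVER_MIN_SUPPLY_SPAN_V = 4.5  # minimum supply range of the driver (from the text)


def shifted_level(v_in: float) -> float:
    """LVTTL level after the zener diode, assuming an ideal constant drop."""
    return round(v_in - ZENER_DROP_V, 3)


low, high = shifted_level(0.0), shifted_level(3.3)
supply_span = 2.5 - (-2.5)    # driver powered from +/-2.5 V
print(low, high, supply_span >= DRIVER_MIN_SUPPLY_SPAN_V)
```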

## 4.4.3 Test block

The test part of the system is very important to check the proper operation of the ASIC and also for its calibration. The ASIC can be configured in test mode by setting the bit test\_on in the control register to '1'. The test enable mask must have only one of its 64 bits set to '1', which selects the corresponding channel for the test. This channel will be sensitive to the test signal input. The test pulse has to be a square pulse with a voltage step in the range (10-1000) mV on a 47 pF capacitor. This is equivalent to a signal charge of  $(0.47-47)$  pC, which covers the input range of the chip. Two types of test pulse signal are possible for the HDRDAQ board: 1) external, generated by an external pulse generator, and 2) internal, generated by the DAQ board. A passive switch is included to select which of the two options is used. The internal test pulse is generated by 4 DACs. The analog signal is passed through an analog switch, which alternates its output between its two inputs: 1) the DAC signal and 2) a ground signal. The switch and the DAC are controlled by the FPGA. The DAC specifies the amplitude of the pulse and its width. In this case only positive test signals are considered.
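The charge range quoted above follows from $Q = C \cdot V$. A minimal sketch of the conversion, with a hypothetical helper name, is:

```python
TEST_CAPACITOR_PF = 47.0  # injection capacitor in pF (from the text)


def injected_charge_pc(step_mv: float) -> float:
    """Charge in pC injected by a voltage step in mV: Q = C * V."""
    return TEST_CAPACITOR_PF * step_mv / 1000.0  # pF * mV = fC, divided by 1000 gives pC


q_min = injected_charge_pc(10.0)
q_max = injected_charge_pc(1000.0)
print(q_min, q_max)
```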

#### 4.4.4 Clock signals

There are four oscillators on the HDRDAQ board. One, with a frequency of 16 MHz, passes through the FPGA and is sent to the ADC. Another provides the 40 MHz clock for the general needs of the FPGA firmware. The remaining two are a 25 MHz clock for the PHY and a 620 MHz reference clock for the STiC TDC; the interface of the latter is explained in the next section. The clock signals are connected to dedicated clock buffer inputs of the FPGA. The oscillators are powered from a digital 3.3V supply, decoupled with a 100 nF capacitor.

#### 4.4.5 STiC signals

Table [4.2](#page-61-0) lists the signals interconnected between the HDRDAQ board and the STiC ASIC. The STiC is a free-running chip, so no trigger signals are generated. The data output, the TXD signal, is a 3.3V LVDS signal and can be connected directly to the FPGA. The chip needs two fast clocks: 1) a 160 MHz clock for the digital part of the chip and for the synchronization of the output data and 2) a 620 MHz clock for the PLL of the TDC. The chip is configured through an SPI slow control register with the signals SC, sdi, sdo and sclk. They are 3.3V CMOS signals and are also connected directly to the FPGA. Four STiC chips can be connected to the HDRDAQ board. The reset signal is used to reset the PLLs in the chips synchronously. This is needed to synchronize the start time of the TDC counters so that all chips share the same time base. To keep the time aligned there is one more condition: all the TDCs must work with the same reference clock. To fulfill this, a 620 MHz clock is generated by an oscillator and then fanned out to the 4 connectors dedicated to the STiC chips. This creates a clock tree and guarantees the same clock reference for all the chips.

## 4.4.6 FPGA

The main component of the board is the FPGA. It implements the logic for the control of the hybrid boards, the acquisition process, the data packet management and the communication with the PC. The FPGA chosen for the design is the Spartan-6 XC6SLX75-FG676 from Xilinx. The main reasons for this choice are its 408 user input/output pins, its 172 block RAMs of 18 Kb each and its low price. Table [4.3](#page-61-1) summarizes the attributes of this Spartan-6 FPGA.

| <b>Group</b>         | <b>Name</b> | <b>Type</b>   | <b>Direction</b>        |
|----------------------|-------------|---------------|-------------------------|
| Slow control signals | SC          | CMOS 3.3V     | $FPGA \Rightarrow ASIC$ |
|                      | sdi         | CMOS 3.3V     | $FPGA \Rightarrow ASIC$ |
|                      | sclk        | CMOS 3.3V     | $FPGA \Rightarrow ASIC$ |
|                      | sdo         | CMOS 3.3V     | $ASIC \Rightarrow FPGA$ |
|                      | reset       | CMOS 3.3V     | $FPGA \Rightarrow ASIC$ |
| Readout signals      | 160 MHz clk | LVDS          | $FPGA \Rightarrow ASIC$ |
|                      | 620 MHz clk | LVPECL        | $FPGA \Rightarrow ASIC$ |
|                      | TXD         | LVDS          | $ASIC \Rightarrow FPGA$ |

<span id="page-61-0"></span>Table 4.2: List of the signals between the HDRDAQ board and the STiC chip.

| <b>Device</b> | <b>Logic Cells</b> | <b>CLB Slices</b> | <b>CLB Flip-Flops</b> | <b>CLB Max Distributed RAM (Kb)</b> | <b>DSP48A1 Slices</b> | <b>Block RAM 18 Kb Blocks</b> | <b>Block RAM Max (Kb)</b> | <b>CMTs</b> | <b>Memory Controller Blocks (Max)</b> | <b>Total I/O Banks</b> | <b>Max User I/O</b> |
|---------------|--------------------|-------------------|-----------------------|-------------------------------------|-----------------------|-------------------------------|---------------------------|-------------|---------------------------------------|------------------------|---------------------|
| XC6SLX75      | 74,637             | 11,662            | 93,296                | 692                                 | 132                   | 172                           | 3,096                     | 6           |                                       | 6                      | 408                 |

<span id="page-61-1"></span>Table 4.3: Summary of Spartan-6 FPGA Attributes.

## 4.4.7 Spartan-6 power supply decoupling capacitors

The bypass and decoupling capacitors of the FPGA are of vital importance, since the FPGA will probably be the main source of noise on the board. There are two main reasons: first, the FPGA is the most power-demanding integrated circuit on the board; second, by its nature the transient current demands are higher in the FPGA than in other ICs. Therefore, the number of capacitors, their values, their type and their placement with respect to the FPGA power pins are issues to be considered [\[Ale\]](#page-117-4). Each bank has separate  $V_{CCO}$  lines, so Spartan-6 devices in this package support six independent  $V_{CCO}$  supplies. The  $V_{CCO}$  supply biases the IOBs of the corresponding bank. Therefore, only inputs/outputs with compatible standards (i.e. with the same  $V_{CCO}$  level) should be used within a bank. In this design only one bank has  $V_{CCO}$  equal to  $+2.5V$; the rest of the banks use  $+3.3V$. Concerning the number of capacitors, the basic rule is to have at least one capacitor per  $V_{CC}$  pin used on the device. Therefore, the effective number of  $V_{CCO}$  pins for each supply must be determined. All supplies must be considered:  $V_{CCINT}$,  $V_{CCAUX}$  and  $V_{CCO}$. The  $V_{CCAUX}$  and  $V_{CCINT}$  pins must always be fully decoupled, that is, they must always have one capacitor per pin.  $V_{CCO}$  can be prorated according to the I/O utilization. The number of  $V_{CCO}$  pins used by a device can be determined from the Simultaneously Switching Output (SSO) restrictions given in the device documentation. When a large number of outputs switch simultaneously in the same direction, a ground or power bounce occurs. The utilization of the I/O resources in a bank determines the percentage of the SSO budget used. This percentage effectively represents the percentage of  $V_{CCO}$  pins used by the device.
The SSO limit describes the maximum number of user output pins of a given output signal standard that should switch simultaneously in the same direction while maintaining a safe level of switching noise. Table [4.4](#page-62-0) summarizes the utilization of all I/Os of the FPGA, listed per bank and standard. The maximum number of I/O pins per bank is taken from [\[Ppg\]](#page-117-5), Table 2-1. The I/O limit for this type of FPGA is calculated by multiplying the number of available power pins by the maximum number of I/Os per power pin. Then, taking into account the number of used I/O pins, the effective percentage of power pins for each bank can be calculated with Equation 4.12. For example, for bank 0, with 41 used pins and a limit of 108 pins, the effective power pins needed amount to 37.96% of the total budget of the bank.

| <b>Bank</b> | <b>Voltage</b> | <b>Power pins</b> | <b>Max I/O per power pin</b> | <b>Limit I/O</b> | <b>I/O Utilization</b> | <b>Effective pins per I/O [%]</b> | <b>Used power pins</b> |
|------|---------|------------|-----------------------|-----------|-----------------|----------------------------|-----------------|
| 0    | 3.3V    | 12         | 9                     | 108       | 41              | 37.96                      | 5               |
| 1    | 3.3V    | 11         | 10                    | 110       | 70              | 63.64                      | 7               |
| 2    | 3.3V    | 8          | 9                     | 72        | 49              | 68.06                      | 6               |
| 3    | 3.3V    | 10         | 9                     | 90        | 59              | 65.56                      | 7               |
| 4    | 2.5V    | 6          | 8                     | 48        | 44              | 91.67                      | 6               |
| 5    | 3.3V    | 6          | 9                     | 54        | 33              | 61.11                      | 4               |

<span id="page-62-0"></span>Table 4.4: Calculation of the effective I/O pins per bank of Spartan 6 (XC6SLX75) FPGA.

$$
\text{effective pins} = \frac{\text{used I/O}}{\text{limit I/O}} = \frac{41}{108} = 37.96\,\%. \tag{4.12}
$$

The number of  $V_{CCO}$  pins used in a bank (Table [4.4\)](#page-62-0) is the number of  $V_{CCO}$  pins in the bank times the percentage of effective pins, rounded up to the next integer. For bank 0, five power pins are needed to fulfill the power supply needs if all 41 pins switch at the same time. This is the worst-case scenario, since in the actual design only a few pins will switch simultaneously. The total number of used power pins for  $V_{CCO}$  is 35.
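The per-bank calculation can be reproduced from the power pin counts, I/O limits and I/O utilization listed in Table 4.4. The sketch below assumes that the number of used power pins is rounded up to the next integer, which is consistent with the value quoted for bank 0 and with the total of 35 pins; the function names are illustrative.

```python
# (bank, V_CCO power pins, I/O limit of the bank, used I/O pins), from Table 4.4
BANKS = [
    (0, 12, 108, 41),
    (1, 11, 110, 70),
    (2, 8, 72, 49),
    (3, 10, 90, 59),
    (4, 6, 48, 44),
    (5, 6, 54, 33),
]


def effective_percentage(used_io: int, limit_io: int) -> float:
    """Share of the bank budget in use, Eq. 4.12."""
    return 100.0 * used_io / limit_io


def used_power_pins(power_pins: int, used_io: int, limit_io: int) -> int:
    """Integer ceiling of power_pins * used_io / limit_io (assumed rounding)."""
    return -(-power_pins * used_io // limit_io)


per_bank = {bank: used_power_pins(pins, used, limit) for bank, pins, limit, used in BANKS}
total_pins = sum(per_bank.values())
print(per_bank, total_pins)
```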

Given the number of discrete capacitors needed as determined above, a distribution of capacitor values adding up to that total must be determined. To cover a broad range of frequencies, a broad range of capacitor values must be used. The proportion of high-frequency capacitors to low-frequency capacitors is an important factor. A ratio of capacitors giving a relatively flat impedance is one where the quantity of capacitors is roughly doubled for every decade of decrease in capacitor value. Table [4.5](#page-63-0) shows the set of percentages recommended by the manufacturer of the FPGA, which help to calculate these ratios from the total number of capacitors.

The total number of used power pins for  $V_{CCO}$  is 35. Following the rule of one capacitor per pin, 35 capacitors are needed for the correct decoupling of the  $V_{CCO}$  network. Table [4.6](#page-63-1) shows how the quantity of each capacitor value is determined. This calculation gives a first-pass estimate of the capacitors necessary for the  $V_{CCO}$  supply. Changes can be made to the exact number of capacitors to accommodate different values and to make the supply more symmetric. The process of capacitor selection for the  $V_{CCAUX}$  and  $V_{CCINT}$  supplies is based on the recommendation from [\[Ppg\]](#page-117-5).

| <b>Capacitor Value</b> | <b>Quantity Percentage</b> | <b>Capacitor Type</b> |
|------------------------|----------------------------|-----------------------|
| 470 μF to 1000 μF      | 4%                         | Tantalum              |
| 1.0 to 4.7 $\mu$ F     | 14%                        | X7R 0805              |
| 0.1 to 0.47 $\mu$ F    | 27%                        | X7R 0603              |
| 0.01 to 0.047 $\mu$ F  | 55%                        | X7R 0402              |

<span id="page-63-0"></span>Table 4.5: Percentages of the different capacitor values for a balanced decoupling network [Dec].

| <b>Capacitor Value</b> | <b>Calculated</b>                 | <b>Quantity of Capacitors</b> |
|------------------------|-----------------------------------|-------------------------------|
| 470 µF                 | 35 pins $\times$ 4% = 1.4         |                               |
| 4.7 µF                 | 35 pins $\times$ 14% = 4.9        |                               |
| 470 nF                 | 35 pins $\times$ 27% = 9.45       |                               |
| 22 nF                  | 35 pins $\times$ 55% = 19.25      | 19                            |

<span id="page-63-1"></span>Table 4.6: Calculation of the number of capacitors required for  $V_{CCO}$  supply voltage.
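The first-pass estimate of Table 4.6 is the product of the 35 used  $V_{CCO}$  pins and the percentages of Table 4.5. A minimal sketch, with illustrative names, is:

```python
# Percentage of the total capacitor count per value, from Table 4.5,
# keyed by the value actually selected in Table 4.6.
PERCENT_BY_VALUE = {"470 uF": 4, "4.7 uF": 14, "470 nF": 27, "22 nF": 55}


def capacitor_shares(total_pins: int, percent_by_value: dict) -> dict:
    """Unrounded number of capacitors of each value for a given pin count."""
    return {value: total_pins * pct / 100 for value, pct in percent_by_value.items()}


shares = capacitor_shares(35, PERCENT_BY_VALUE)
for value, count in shares.items():
    print(f"{value}: {count:.2f}")
```

The results are fractional, so the final quantities have to be rounded and can be adjusted for symmetry, as noted in the text.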

Table [4.7](#page-64-0) shows the number and the value of the capacitors calculated and placed on the board. As a result, a minimum of 51 capacitors are needed for the decoupling of the FPGA. In practice, 59 capacitors are placed on the board. Tantalum  $47 \mu$ F capacitors are used for low-frequency capacitance because of their higher ESR (Equivalent Series Resistance) than ceramic chip capacitors, making them less likely to contribute to antiresonance spikes. For the rest of the values ceramic capacitors are used. The capacitors are symmetrically placed around the FPGA. The smaller values are located as close as possible to the power pins for better decoupling effect.

## 4.4.8 Flash memory

The FPGA does not retain its configuration after power down, so it has to be reprogrammed each time it is turned on. The flash memory is the device that contains the configuration bit file for programming the FPGA. Flash memory is a non-volatile storage chip that can be electrically erased and reprogrammed. The platform flash PROM implemented in the design is an XCF32P from Xilinx [\[Fla\]](#page-116-3) with a density of 32 Mbit. The memory can be programmed in-system via the standard 4-pin JTAG (Joint Test Action Group) interface. The programming data sequence is delivered to the flash memory or directly to the FPGA using the Xilinx iMPACT software and a Xilinx download cable.

Figure [4.12](#page-65-0) shows the connection between the flash memory and the FPGA implemented on the HDRDAQ board. Of the different configuration modes, the Master SelectMAP (parallel) mode has been chosen. In Master SelectMAP mode, byte-wide

| <b>Capacitor</b> | <b>Vcco 3.3 V</b> | <b>Vccaux 2.5 V</b> | <b>Vccint 1.2 V</b> | <b>Total N</b> | <b>Placed N</b> |
|-------------|--------------|--------------|--------------|----------------|----------|
| 470 µF      |              |              |              |                |          |
| 4.7 µF      |              |              |              | 10             |          |
| 470 nF      |              |              |              | 18             | 21       |
| 22 nF       | 19           |              |              | 19             | 22       |

<span id="page-64-0"></span>Table 4.7: Summary of calculated and used capacitors on the board.

data is written into the FPGA, typically with a *BUSY* flag controlling the flow of data, synchronized by the configuration clock  $(CCLK)$  generated by the FPGA.

## 4.4.9 Ethernet connection

The Internet Protocol version 4 (IPv4) is the most widely used Internet Layer protocol. The combination of the IPv4 with User Datagram Protocol (UDP) [\[Xip\]](#page-116-4) represents the optimal solution of hardware resource requirements for data transmission between a host PC and FPGA board. Because of the comparatively small protocol overhead, UDP allows for high transmission rates while at the same time requires low programming overhead on the PC side. UDP is based on datagrams that are encapsulated into Ethernet packets.
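To illustrate the low programming overhead on the PC side, the following sketch exchanges one UDP datagram over the loopback interface using the Python standard library. The address, the port selection and the payload are placeholders; they do not correspond to the HDRDAQ packet format or to the DAQ software.

```python
import socket


def open_receiver(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Open a UDP socket bound to host:port (port 0 lets the OS pick one)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(1.0)  # do not block forever if no datagram arrives
    return sock


def send_datagram(payload: bytes, address) -> None:
    """Send a single datagram; UDP needs no connection setup."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        tx.sendto(payload, address)


# Loopback demonstration with a placeholder 4-byte payload.
receiver = open_receiver()
send_datagram(b"\x01\x02\x03\x04", receiver.getsockname())
data, sender = receiver.recvfrom(2048)
receiver.close()
print(data.hex())
```

In a real acquisition the sender would be the FPGA board, and the receiver would have to cope with lost or reordered datagrams, since UDP provides no delivery guarantee.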

Most modern FPGAs contain EMAC (Ethernet Media Access Controller) blocks that allow for direct access to external physical layer (PHY) devices on the board. A PHY is required for the EMAC to connect to an external device, for example, a network, a host PC, or another FPGA. The Xilinx Core generator can be used to configure and generate EMAC wrapper files that contain a user configurable Ethernet MAC physical interface, e.g., MII, GMII, SGMII.

In the case of the HDRDAQ board, the communication with the PC is done through Ethernet. For this purpose the  $LogiCORE^{TM}$  IP Tri-Mode Ethernet Media Access Controller (TEMAC) solution, a 10/100/1000 Mb/s Ethernet MAC supporting half-duplex and full-duplex operation, is implemented in the FPGA.

The DP83865 transceiver from National Semiconductor [\[DP8\]](#page-116-5) is used for the physical layer. It supports the 10BASE-T, 100BASE-TX and 1000BASE-T Ethernet protocols. On one side this device interfaces directly to the MAC layer through the IEEE 802.3u Standard Media Independent Interface (MII) or the IEEE 802.3z Gigabit Media Independent Interface (GMII). On the other side it interfaces directly to the twisted pair media via an external transformer included in the RJ-45 connector. The twisted pair interface consists of four differential media dependent I/O pairs ($MDI_A$, $MDI_B$, $MDI_C$ and $MDI_D$), terminated with 49.9 $\Omega$ resistors.

The PHY is configured in auto-negotiation mode, so the three transmission speeds (10/100/1000 Mb/s) are available in full and half duplex. For the synchronization of the



<span id="page-65-0"></span>Figure 4.12: Scheme employed for the connection of the flash memory and the FPGA Master SelectMAP (Parallel) Mode. Figure from [\[Ads\]](#page-116-6).

TEMAC and the PHY, the latter provides a clock output whose frequency can be selected through a resistor switch to be 25 MHz for 100 Mb/s transmission or 125 MHz for 1 Gb/s transmission. There are 5 LEDs that indicate the current status of the link: they show the transmission speed and whether full duplex is active, and the active LED blinks each time a packet passes through the link. Special attention is given to the power supply decoupling capacitors: a capacitor is provided at each power pin for noise reduction. A reset signal with a duration of 200 ms is provided by the FPGA through the S3-PHY-RESET pin.
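The two selectable clock frequencies match the data bus widths of the MAC-PHY interfaces. The bus widths used below (4 bits for MII, 8 bits for GMII) come from the IEEE 802.3 interface definitions rather than from this document, and the names are illustrative.

```python
# (data bus width in bits, interface clock in Hz)
INTERFACES = {
    "MII": (4, 25e6),     # 100 Mb/s operation
    "GMII": (8, 125e6),   # 1 Gb/s operation
}


def interface_throughput_bps(width_bits: int, clock_hz: float) -> float:
    """Raw throughput of a single-data-rate parallel interface."""
    return width_bits * clock_hz


rates = {name: interface_throughput_bps(*spec) for name, spec in INTERFACES.items()}
print(rates)
```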

## 4.5 Power supply

The HDRDAQ board uses a desktop adapter power supply that provides a 6V maximum output. The supply is connected to the board with a 2.5 mm inner diameter jack connector. The 6V supply is fuse-protected and conditioned before it reaches the voltage regulators that supply the proper bias to each of the sections of the board. Table [4.8](#page-66-0) summarizes the voltage levels of the HDRDAQ board. The board needs a single power supply of  $+6V$  and the rest of the levels are generated by the power block of the board. There are separate voltages for the analog section and the digital section of the board. Special attention is given to the analog amplifiers and the ADC. The  $\pm 5V$  supply is used for the comparators and for the general purpose needs of the board. The FPGA needs +3.3V, +2.5V and +1.2V. The Ethernet transceiver uses  $+2.5V$,  $+3.3V$  and  $+1.8V$, and the flash memory needs  $+3.3V$  and  $+1.8V$.

| External voltage levels | Consumer                       |
|-------------------------|--------------------------------|
| $+6V$                   | Main power board               |
| $\pm 2.5V$              | ASIC (Vata64HDR16) and drivers |
| <b>Generated voltage levels</b> | <b>Consumer</b>         |
| ±5V                     | General purpose                |
| $\pm 5V$ A              | Analog amplifiers              |
| $+1.2V$                 | <b>FPGA</b>                    |
| $+2.5V$                 | FPGA, PHY                      |
| $+3.3V$                 | FPGA, FLASH, PHY               |
| $+1.8V$ D               | ADC digital                    |
| $+1.8V$ A               | ADC analog                     |
| $+1.8V$                 | PHY and FLASH                  |

<span id="page-66-0"></span>Table 4.8: Summary of the power supply voltage levels used in the HDRDAQ board.

The HDRDAQ board is designed to be compatible with different ASIC versions from the same family or with similar pin-out characteristics. This leads to different voltage levels and current consumption. The drivers on the board responsible for the interface with the ASIC accept up to  $\pm 5V$, so they are compatible with any ASIC supply level within that range. The current consumption depends on the number of ASICs placed on the hybrid board. For that reason the power for the ASICs is provided by external power supply equipment.

#### 4.5.1 Voltage converters

Low dropout (LDO) regulators and DC-DC converters are used to generate the different supply levels on the HDRDAQ board. An LDO provides regulation and stability and reduces the noise associated with switching power supplies. The use of regulators produces a very good rejection at low frequencies and some rejection at higher frequencies. The most important parameter is the dropout voltage, which is defined as the minimum voltage drop required across the regulator to maintain output voltage regulation. A critical point to be considered is that the linear regulator that operates with the smallest voltage across it dissipates the least internal power and has the highest efficiency. Although most LDOs share a similar operating principle, large differences are found among the available models. Thus, the features considered for the selection of the regulators were the following: 1) Small voltage drop to reduce the

power consumption and heating, 2) High ripple/noise rejection, to provide ripple and noise reduction in addition to that obtained with the capacitors of the switching power supplies, 3) Compatibility with low-ESR output capacitors: many regulators can be operated only with tantalum capacitors or similar, which have a relatively large ESR and are not an optimal solution for noise reduction at high frequencies, 4) Over-current protection and thermal shutdown.

The LDOs selected are the ADP3339 (for  $+5V$  and  $+1.8V$) from Analog Devices and the LT1185 (for -5V) from Linear Technology. Their characteristics are summarized below.

#### ADP3339

The ADP3339 is a low dropout voltage regulator. It operates with an input voltage range of 2.8V to 6V, delivers a load current of up to 1.5A and requires only a  $1.0 \mu$F output capacitor for stability. The ADP3339 has a typical dropout voltage of 230mV at full load, i.e. at the maximum output current. Figure [4.13](#page-67-0) shows the dropout voltage versus the load current. As can be seen, working with currents smaller than the maximum is an advantage because of the resulting reduction in the voltage drop.



<span id="page-67-0"></span>Figure 4.13: Dropout voltage vs. load current [\[ADP\]](#page-116-7).

The ripple rejection of the ADP3339 is very good compared to other available devices, decreasing for frequencies above 100kHz but still providing some rejection at high frequencies, as shown in Figure [4.14.](#page-68-0)

The stability and transient response of the LDO depend on the output capacitor. The ADP3339 is stable with a wide range of capacitor values, types and ESR. A capacitor as small as  $1\mu$F is sufficient to provide the required stability when it is placed near the output and ground pins. The ADP3339 is also stable with extremely low ESR capacitors (ESR  $\approx$  0) such as multilayer ceramic capacitors. The regulator is short-circuit protected by limiting the drive current in the transistor base. It is also protected against damage due to excessive power dissipation by its thermal overload protection circuit.



<span id="page-68-0"></span>Figure 4.14: Power supply ripple rejection (a) and noise spectral density vs. frequency (b) of the ADP3339 [\[ADP\]](#page-116-7).

#### LT1185

For the generation of the -5V supply, the LT1185 low dropout regulator from Linear Technology is used [\[LT1\]](#page-116-8). For a maximum output current of 3A, the LT1185 produces a typical voltage drop of 0.75V. In the HDRDAQ board the input voltage is -6V and the output voltage is -5V, with  $\pm 1\%$  accuracy on the internal reference voltage. The high efficiency of the regulator is maintained by special anti-saturation circuitry that adjusts the base drive to track the load current. The on resistance is typically 0.25Ω. The LT1185 has power limiting and thermal shutdown protection features.
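The headroom of the two linear regulators can be compared with their dropout voltages. The sketch below ignores the quiescent current, so the efficiency is approximated by the ratio of output to input voltage. The 0.5 A load current is a hypothetical value, and feeding the ADP3339 from the +6V input for its +5V output is an assumption made for the example.

```python
def ldo_operating_point(v_in: float, v_out: float, dropout_v: float, i_load_a: float) -> dict:
    """Headroom, regulation check, dissipation and approximate efficiency of an LDO."""
    headroom_v = abs(v_in) - abs(v_out)
    return {
        "headroom_v": headroom_v,
        "in_regulation": headroom_v >= dropout_v,   # dropout quoted at full load
        "dissipation_w": headroom_v * i_load_a,     # power lost in the pass element
        "efficiency": abs(v_out) / abs(v_in),       # quiescent current neglected
    }


lt1185 = ldo_operating_point(v_in=-6.0, v_out=-5.0, dropout_v=0.75, i_load_a=0.5)
adp3339 = ldo_operating_point(v_in=6.0, v_out=5.0, dropout_v=0.23, i_load_a=0.5)
print(lt1185)
print(adp3339)
```

Under these assumptions both regulators have 1 V of headroom, which exceeds the full-load dropout values quoted in the text.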

#### PTH05050W

The power supply for the digital part of the HDRDAQ board uses DC-DC converters. Compared with LDOs, they can provide a higher output current but are noisier because of their switching principle of operation. Since digital signals are less affected by power supply noise than analog ones, DC-DC converters are suitable for powering the digital section of the board. The PTH05050W voltage regulator is selected for the conversion to  $+3.3V$ ,  $+2.5V$  and  $+1.2V$ , mainly used by the FPGA, the flash memory, and the PHY. For the conversion from  $+6V$  to  $-6V$ the PTH04050W from Texas Instruments [\[Pth\]](#page-116-9) is employed. The PTH05050W converts a  $+5V$  input voltage to an output voltage in the range of 0.8V to 3.6V, set with a single resistor. It can provide up to 6A of output current and has an efficiency of up to 95%. For protection against load faults, the regulator incorporates output over-current protection: applying a load that exceeds the regulator's over-current threshold causes the regulated output to shut down. The PTH05050W was chosen because of its high current capability, small size and protection circuit, and because the same component can be configured to provide the three different voltage levels.

#### PTN04050a

The PTN04050a is a positive-to-negative integrated switching regulator with adjustable output [\[Pth2\]](#page-116-10). Operating from a wide input voltage range of 2.9V to 7V, the PTN04050a provides high-efficiency voltage conversion for loads of up to 6W. The output voltage is set using a single external resistor and may be set to any value within the range from -15V to -3.3V. The regulator also has soft-start circuitry which, during power-up, slows the rate at which the output voltage rises, thereby limiting the in-rush current that can be drawn from the input source. The soft-start circuitry introduces a time delay (typically 60ms) into the power-up characteristic.

## 4.5.2 Power distribution

Figure [4.15](#page-70-0) shows a sketch of the power distribution adopted for the HDRDAQ board. The +6V input passes through a fuse for over-current protection and is then split into two branches. One branch goes to the DC-DC converter that inverts the voltage polarity to -6V; an LDO regulator then provides the -5V output level. The other branch goes to a diode that produces a voltage drop of 0.5V. The resulting  $+5.5V$  is led, on one side, to the LDOs that generate the  $+5V$  and  $+1.8V$ voltage levels. On the other side, it goes to a second diode that reduces it further to  $+5V$, which powers the DC-DC converters that generate the digital voltages  $+3.3V$ ,  $+2.5V$  and  $+1.2V$ .
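The voltage levels of this power tree can be checked against the dropout figures quoted for the regulators. The following is a minimal sketch and not part of the board design: it assumes that the positive LDOs are ADP3339 devices with the 230mV full-load dropout, takes the 0.75V dropout of the LT1185, and uses the hypothetical helper name `headroom`.

```python
# Headroom check for the linear regulators of the power tree.
# Assumption: the +5V and +1.8V LDOs are ADP3339 parts (230 mV dropout at full load).
FUSED_INPUT_V = 6.0
DIODE_DROP_V = 0.5

ldo_rail_v = FUSED_INPUT_V - DIODE_DROP_V   # +5.5 V feeding the positive LDOs
dcdc_rail_v = ldo_rail_v - DIODE_DROP_V     # +5.0 V feeding the DC-DC converters


def headroom(v_in, v_out, dropout):
    """Margin (V) left after subtracting the dropout from the in/out difference."""
    return abs(v_in) - abs(v_out) - dropout


margins = {
    "+5V LDO": headroom(ldo_rail_v, 5.0, 0.230),
    "+1.8V LDO": headroom(ldo_rail_v, 1.8, 0.230),
    "-5V LDO (LT1185)": headroom(-6.0, -5.0, 0.75),
}

for name, margin in margins.items():
    status = "ok" if margin > 0 else "insufficient"
    print(f"{name}: margin {margin * 1e3:.0f} mV ({status})")
```

With these numbers every regulator keeps a positive margin even at its full-load dropout, and the margin grows at lower load currents.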

There is a separate input connector for the power supply of the hybrid boards and of the drivers used for the *logical output signals* to the ASIC. The reason for the separate connector is the requirement for versatility: different hybrid boards with different ASICs, which can have different voltage levels or current consumption, must be connectable. The output drivers automatically adjust their output logic level if the power supply is changed. This makes the HDRDAQ board flexible for future updates of the front-end electronics and detectors.

Each regulator is followed by ferrite beads. Ferrites provide an inexpensive way of adding high-frequency loss to a circuit without introducing power loss at DC or low frequencies. A ferrite bead enclosing a conductor provides the highly desirable property of increasing impedance as frequency rises. This effect suits high-frequency noise filtering of conductors carrying DC and low-frequency signals. At higher frequencies, the ferrite material interacts with the magnetic field of the conductor, creating the characteristic loss. Various ferrite materials and geometries result in different loss factors versus frequency and power level. As the DC current and, hence, the constant magnetic-field bias rise, the ferrite becomes less effective in providing loss.

<span id="page-70-0"></span>Figure 4.15: Power distribution scheme implemented in the HDRDAQ board.

After the ferrite beads there are three capacitors with values of 10nF, 100nF and  $47\mu$F. In theory [\[XAP\]](#page-116-11), a set of capacitors in parallel (one per decade) improves noise filtering. At low frequencies the capacitance dominates the behaviour of a capacitor, and the filtering improves with increasing capacitance, as shown in Figure [4.16.](#page-71-0) At high frequencies the inductance of the capacitor dominates, limiting the noise rejection. At some intermediate frequency there is a peak at which the insertion loss improves, due to the resonance of the capacitance with the inductance. This peak is limited by the value of the ESR.
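The behaviour described above follows from the series R-L-C model of a real capacitor, whose impedance magnitude is $|Z| = \sqrt{ESR^2 + (2\pi f L - 1/(2\pi f C))^2}$ and whose resonance lies at $f_0 = 1/(2\pi\sqrt{LC})$. The sketch below evaluates this model for the three capacitor values. The parasitic inductance (1nH) and the ESR (10mΩ) are assumed, illustrative numbers, not values of the components actually mounted on the board, and the function names are hypothetical.

```python
import math

ESL_H = 1e-9     # assumed parasitic inductance, same for all three parts
ESR_OHM = 0.010  # assumed equivalent series resistance

CAPACITORS_F = {"10nF": 10e-9, "100nF": 100e-9, "47uF": 47e-6}


def self_resonance_hz(c_farad, esl_henry):
    """Frequency at which the capacitive and inductive reactances cancel."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl_henry * c_farad))


def impedance_ohm(f_hz, c_farad, esl_henry, esr_ohm):
    """Impedance magnitude of the series R-L-C model of a capacitor."""
    omega = 2.0 * math.pi * f_hz
    reactance = omega * esl_henry - 1.0 / (omega * c_farad)
    return math.hypot(esr_ohm, reactance)


for name, c in CAPACITORS_F.items():
    f0 = self_resonance_hz(c, ESL_H)
    z_min = impedance_ohm(f0, c, ESL_H, ESR_OHM)
    print(f"{name}: f0 = {f0 / 1e6:.2f} MHz, |Z(f0)| = {z_min * 1e3:.1f} mOhm")
```

At resonance the impedance reduces to the ESR, which is why the depth of the peak is limited by it. Under these assumptions each capacitor resonates in a different frequency range, so the parallel combination filters over a wider band than any single value.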

# 4.6 Complements

HDRDAQ includes extra elements for easier debugging of the board functionality. These are:

- $\Diamond$ A rotary switch which determines the identity (ID) number of the board. If more than one HDRDAQ board is used, each board must have a unique ID number so that the software can distinguish it from the others.
- $\Diamond$ There are 10 LEDs and three push buttons, mainly for debugging the firmware functionality during development. They are also very useful when, for example, they are programmed to indicate the working mode, or to indicate any process, stage or condition to the user. One of the buttons is dedicated to the internal reset of the firmware without the need to reprogram the FPGA from the flash memory.
- $\Diamond$ Most of the unused I/O pins of the FPGA are routed to a connector. Those pins are mainly used as outputs for internal signals of the firmware, so that they can be monitored on a scope while debugging the board.

<span id="page-71-0"></span>Figure 4.16: Effect obtained paralleling some capacitors with different values.

# 4.7 Layout considerations

The layout of the traces plays an important role in the electronic performance of the board. Because of the high density of traces and the characteristics of the signals, automatic routing is possible only for a limited number of signals. Therefore most of the traces are routed manually, taking their characteristics into consideration. The combination of a signal trace and a reference plane forms a transmission line. All signals in a PCB system travel through transmission lines. Good signal integrity in a PCB system depends on having transmission lines with controlled impedance. The impedance is determined by the geometry of the traces and by the dielectric constant of the material in the space around the signal traces and between the signal trace and the reference plane.

In the HDRDAQ board many signals are low-voltage differential signals (LVDS). The LVDS standard is a way to communicate data using a very low voltage swing, which provides very low power consumption. A simplified diagram of the LVDS driver and receiver is shown in Figure [4.17.](#page-72-0) The driver output consists of a current source (3.5mA nominal) which drives one of the differential pair lines. The receiver has a high DC impedance (it does not source or sink DC current), so the majority of the driver current flows across the  $100\Omega$  termination resistor, generating about 350mV across the receiver inputs. When the driver switches, it changes the direction of the current flow across the resistor, thereby creating a valid one or zero logic state.

Figure 4.17: Simplified diagram of LVDS driver and receiver connected via 100Ω controlled differential impedance media.

Characteristic impedance equations for stripline and microstrip signal traces are shown in Table [4.9.](#page-72-1) In this table, the medium is characterized by the dielectric constant  $\varepsilon_r$ ,  $Z_0$  is the characteristic impedance of a single trace and  $Z_{DIFF}$  the differential impedance;  $S$  denotes the distance between traces,  $h$  the thickness of the board,  $W$  the trace width,  $t$  the trace thickness and  $b$  the dielectric thickness (between ground planes). (Note that all geometric variables must be in the same dimensional units.)

|            | Microstrip | Stripline |
|------------|------------|-----------|
| $Z_{DIFF}$ | $\approx 2 \times Z_0 \left( 1 - 0.48e^{-0.96 \frac{S}{h}} \right) \Omega$ | $\approx 2 \times Z_0 \left( 1 - 0.374e^{-2.9 \frac{S}{b}} \right) \Omega$ |
| $Z_0$      | $\frac{60}{\sqrt{0.475 \varepsilon_r + 0.67}} \ln \left( \frac{4h}{0.67 (0.8W + t)} \right) \Omega$ | $\frac{60}{\sqrt{\varepsilon_r}} \ln \left( \frac{4b}{0.67 \pi (0.8W + t)} \right) \Omega$ |

Table 4.9: Mathematical expressions for determining the characteristic impedance in microstrips and striplines.
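As an illustration of how such closed-form expressions are used, the sketch below evaluates the commonly quoted approximations for microstrip and stripline traces, with the coefficient 0.475 in the microstrip $Z_0$ and the trace-to-plane height $h$ for microstrip. The trace widths and spacing are those quoted in Figures 4.19 and 4.20; the dielectric constant, dielectric heights and copper thicknesses are assumed values chosen only for the example, since the real stack-up parameters were fixed by the PCB manufacturer. The function names are hypothetical.

```python
import math


def microstrip_z0(eps_r, h, w, t):
    """Single-ended microstrip impedance (ohm); h, w, t in the same units."""
    return (60.0 / math.sqrt(0.475 * eps_r + 0.67)
            * math.log(4.0 * h / (0.67 * (0.8 * w + t))))


def stripline_z0(eps_r, b, w, t):
    """Single-ended stripline impedance (ohm); b is the plane-to-plane spacing."""
    return (60.0 / math.sqrt(eps_r)
            * math.log(4.0 * b / (0.67 * math.pi * (0.8 * w + t))))


def microstrip_zdiff(z0, s, h):
    """Differential impedance of an edge-coupled microstrip pair."""
    return 2.0 * z0 * (1.0 - 0.48 * math.exp(-0.96 * s / h))


def stripline_zdiff(z0, s, b):
    """Differential impedance of an edge-coupled stripline pair."""
    return 2.0 * z0 * (1.0 - 0.374 * math.exp(-2.9 * s / b))


# Assumed stack-up values (mm), NOT taken from the HDRDAQ fabrication data.
EPS_R = 4.3       # typical FR-4
H_OUTER = 0.12    # outer dielectric height
T_OUTER = 0.035   # outer copper thickness
B_INNER = 0.30    # spacing between the planes around an inner layer
T_INNER = 0.017   # inner copper thickness

z0_single = microstrip_z0(EPS_R, H_OUTER, w=0.22, t=T_OUTER)
z0_pair = microstrip_z0(EPS_R, H_OUTER, w=0.12, t=T_OUTER)
zdiff_pair = microstrip_zdiff(z0_pair, s=0.165, h=H_OUTER)
z0_strip = stripline_z0(EPS_R, B_INNER, w=0.127, t=T_INNER)

print(f"microstrip, W=0.22 mm:            Z0    = {z0_single:.1f} ohm")
print(f"coupled microstrip, W=0.12 mm:    Zdiff = {zdiff_pair:.1f} ohm")
print(f"stripline, W=0.127 mm:            Z0    = {z0_strip:.1f} ohm")
```

The results only land in the neighbourhood of the 50Ω and 100Ω targets because the stack-up values are guesses; in practice the widths and spacings are tuned with a field solver for the real material data.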

Differential transmission is practically immune to power supply fluctuations [\[Hor\]](#page-116-0). For communication between circuit cards, where there is no cheap way to provide low-impedance power distribution, the cost and extra space required for differential transmission are often less than the cost and space needed for an improved power distribution cable.

The differential data transmission method used in LVDS is less susceptible to common-mode noise than single-ended schemes. Differential transmission conveys the data over two wires with opposite current/voltage swings instead of the one wire used in single-ended methods. The advantage of the differential approach is that noise is coupled to both wires as common mode (the noise appears on both lines equally) and is thus rejected by receivers sensitive only to the difference between both signals. Differential signals also radiate less noise than single-ended signals due to the cancellation of magnetic fields. In addition, the current mode driver is not prone to switching spikes, further reducing noise.

As differential technologies such as LVDS reduce concerns about noise, they can work with lower signal voltage swings. This advantage is crucial, because it is impossible to increase data rates and lower the power consumption simultaneously without using low voltage swings. The low swing of the driver implies that data can be switched very quickly. Since the driver also operates in current mode, very low power consumption across frequency is achieved, because the power consumed by the load ( $3.5\,\text{mA} \times 350\,\text{mV} \approx 1.2\,\text{mW}$ ) stays almost constant.
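The two LVDS figures quoted in this section follow directly from Ohm's law applied to the nominal drive current and the termination resistor. A short check, with hypothetical variable names:

```python
# Nominal LVDS values quoted in the text.
I_DRIVE_A = 3.5e-3   # driver current source
R_TERM_OHM = 100.0   # termination resistor at the receiver

v_diff = I_DRIVE_A * R_TERM_OHM   # voltage across the receiver inputs
p_load = I_DRIVE_A * v_diff       # power dissipated in the termination

print(f"differential swing: {v_diff * 1e3:.0f} mV")
print(f"load power:         {p_load * 1e3:.3f} mW")
```

The load power does not depend on the switching frequency in this idealized picture, since the driver only reverses the direction of a constant current.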

In the HDRDAQ board there are many signals that need to be routed as transmission lines. For example, all the signals of the Ethernet PHY block must have a trace impedance of  $50\Omega$ . All input/output signals of the ADC are differential, and many signals of the interface with the ASIC are LVDS, which requires careful impedance-matched routing. To cover these needs, the layer stack-up of the board was adapted to have specific layers that can be used for routing the impedance-controlled transmission lines.

For a given layer thickness, trace thickness and dielectric constant, the trace width and trace spacing are modified to achieve the desired differential impedance. In practice, manufacturers recommend the use of specific programs that calculate automatically the values of the width and separation of differential traces for a required impedance value.

In the case of the HDRDAQ, the transmission line geometry was calculated by the manufacturer according to the thickness and the characteristics of the dielectric material used. The resulting 12-layer PCB structure is shown in Figure [4.18.](#page-74-0)

According to the specified layer structure and the distribution of the GND planes, the transmission lines used in the HDRDAQ board are: microstrip, edge coupled microstrip, stripline, and edge coupled stripline. Their geometry is as follows:

1. Microstrip and edge coupled microstrip transmission lines: used in the TOP and BOT layers, with INNER1 and INNER11 as GND planes (Figure [4.19\)](#page-75-0).

2. Stripline and edge coupled stripline: routed in the INNER2 layer, with INNER1 and INNER3 as GND planes (Figure [4.20\)](#page-75-1).

Other considerations taken into account during the routing process are:

- $\Diamond$ Signal traces crossing a plane split are avoided, because a split may cause unpredictable return path currents and would likely result in degraded signal quality as well as EMI problems.
- $\Diamond$ The traces are routed with a spacing of more than twice their width to reduce possible cross-talk.
- $\Diamond$ The placement of the components was optimized for shorter lines, fewer conflicting lines, fewer vias, and better noise immunity, keeping the analog and digital parts in separate areas.
- $\Diamond$ Special attention was paid to all signal traces of the MII and GMII interfaces, which have  $50\Omega$  controlled impedance, and especially to the clock lines running at 125MHz and 640MHz.

<span id="page-74-0"></span>Figure 4.18: HDRDAQ stack-up structure.

The complexity of the design, due to the multiple supply voltages, the transmission line considerations and the density of traces, made the use of a multi-layer PCB necessary.

The number of layers is determined by:

- $\Diamond$ The use of a BGA (Ball Grid Array) package for the Spartan 6, with more than 600 pins, requires several layers for the signal routing. Six layers are used in the case of the HDRDAQ board.
- $\Diamond$ The power supply planes used in the board are for the following voltages: 3.3V, 2.5V, 1.2V (FPGA and FLASH, PHY), 1.8V (PHY and FLASH), 1.8V (ADC digital), 1.8V (ADC analog),  $\pm 5V$  (analog),  $\pm 5V$  (digital),  $\pm 2.5V$  (ASIC). Each of those voltages has its own power plane.
- $\Diamond$ There are three main ground planes: one for the analog circuits, one for the digital parts and one for the DC-DC converters. The planes are isolated from each other and connected at a common point through a ferrite for filtering. The ground planes are distributed considering the analog, digital and power parts of the PCB and also to fulfil the transmission line requirements. All trace impedances are calculated with reference to the ground or power planes.

<span id="page-75-0"></span>Figure 4.19: Top: Microstrip transmission line with  $Z=50\Omega$  and width 0.22mm. Bottom: Edge coupled microstrip transmission line with  $Z=100\Omega$ , width 0.12mm and separation 0.165mm.

<span id="page-75-1"></span>Figure 4.20: Top: Stripline with  $Z=50\Omega$  and width 0.127mm. Bottom: Edge coupled stripline with  $Z=100\Omega$ , width 0.10mm and separation 0.17mm.

Almost all the layers share space for routing and for planes. The resulting board is shown in Figure [4.21.](#page-76-0) It has a height of 167mm and a width of 193mm. The board layout takes into account the capacitance between the power and ground planes, which can provide appreciable power supply decoupling for circuits with high edge rates. This "plane capacitor" has very low ESR and ESL, so the plane capacitance remains effective at frequencies so high that chip capacitors become ineffective. The PCB has solid planes for each of the supply voltages. The inter-plane capacitance between the supply and ground planes is maximized by reducing the plane spacing. The power and ground pins are directly connected to the planes, and the decoupling capacitors are placed as close to the supply pins as possible.

The description of the layout concludes the explanation of the hardware part of the project. The next chapter presents the firmware implemented into the FPGA.

<span id="page-76-0"></span>

Figure 4.21: HDRDAQ board layout. Screenshot showing the TOP (green) and BOT (red) layers.

# Chapter 5

# The firmware

The firmware is an important part of the system: it allows the hardware to be controlled from a PC by sending specific commands to the FPGA. Every command is recognized by the firmware, and the corresponding signals are generated and distributed to the corresponding peripheral hardware. Everything has to be synchronized and the commands executed at the right moment. An additional complexity of the firmware comes from the fact that the developed system must be flexible and able to work in many different configurations (e.g. different readout modes, four independent hybrid boards which can have different configurations in terms of biases, ASIC configuration and other parameters, different trigger modes and so on). In this chapter the functional blocks of the firmware and their configuration options are described in detail.

## 5.1 Firmware requirements

The FPGA of the system implements several logic blocks to control and interface the different hardware parts and to carry out the communication with the PC. The main functions of the FPGA are listed below:

- $\Diamond$ Control the different blocks for the external devices (hybrid boards, ADC, DACs, PHY and others).
- $\Diamond$ Synchronize all devices.
- $\Diamond$ Synchronize the acquisition process.
- $\Diamond$ Implement a time stamping block for each recorded event.
- $\Diamond$ Program the DACs. This includes:
	- Bias control
	- Threshold level control
- $\Diamond$ Control the internal test pulse generation.
- $\Diamond$ Generate a software trigger.
- $\Diamond$ Handle all the trigger conditions.
- $\Diamond$ Control the communication with the PC.
- $\Diamond$ Send, receive, recognize and execute commands coming from the software.
- $\Diamond$ Operate four hybrid boards in parallel per FPGA.
- $\Diamond$ Synchronize with the FPGAs on other HDRDAQ boards.

The firmware must also handle many variable parameters. For example, there are two different running modes, which means that the control signals for the readout process change in timing and sequence according to the selected mode. This also influences the size of the data packet that has to be sent to the PC. Another example is the content of the 872-bit slow control register, which is different for each ASIC. The next section of this chapter describes the different functional blocks of the firmware in more detail.

## 5.2 Structure of the firmware

The firmware is developed in VHDL (Very High Speed Integrated Circuit Hardware Description Language). A global overview of the firmware is sketched in Figure [5.1.](#page-80-0) The main functional blocks implemented in the FPGA are the following:

- $\Diamond$ Communication block: the Ethernet link, including the media access control (MAC, data communication protocol) core and the UDP (User Datagram Protocol) core.
- $\Diamond$ Instruction manager: decodes the instruction words from the PC and generates the response words from the board to the PC.
- $\Diamond$ ADC manager: this block manages the data from the 8 ADCs.
- $\Diamond$ Trigger manager: this block is in charge of defining the trigger type selected for the acquisition process (e.g. run in coincidence, free run, software generated trigger and others) and controls the start of the acquisition process.
- $\Diamond$ ASIC configuration: this logical block is in charge of the slow control configuration register of the ASICs.
- $\Diamond$ ASIC readout: this block takes care of the sequence of signals for the readout of each event. These signals change according to the readout mode.
- $\Diamond$ Bias DACs programming: this block configures the DACs and the analog switches which define the bias polarity and values.
- $\Diamond$ TDC: this block registers the time of arrival of each trigger.
- $\Diamond$ FIFO block: this is a memory block which combines the data from the different functional blocks and arranges them in a packet for transmission to the PC.



<span id="page-80-0"></span>Figure 5.1: Scheme of the functional blocks of the firmware.

### 5.2.1 Instruction manager

This block is implemented in a firmware component named Evaluador. Its mission is to evaluate or recognize the incoming packet, reply to the PC and pass the received data to the corresponding block for further action. Its main functions are the following:

- $\Diamond$ Check whether the received packet is correct.
- $\Diamond$ Discard the packet if it is malformed and report the incident.
- $\Diamond$ Identify the command and start the actions associated with it.
- $\Diamond$ Write to the FIFO for the acknowledgement packets (ACKs), which report the reception status of certain packets.
- $\Diamond$ Write to the FIFO used for the configuration of the ASIC when this command is received.

This component consists of a finite state machine (FSM) and the instance of the ACK FIFO memory.

The format of the packets sent from the PC to the FPGA is shown in Figure [5.2.](#page-81-0) A packet consists of a 10-byte word starting with the hexadecimal code 'CAFE'. Then there is one byte for the destination hybrid board and another byte for the operational code, which indicates the command to be executed. The possible commands and the values of their operational codes (COP) are shown in Table [5.1.](#page-81-1) The next 4 bytes contain the values to be configured. For example, if the operational code is 'Bias config', the parameters are the type of bias to be configured and the desired value. Only for the ASIC configuration commands is the parameters field much longer, since it holds the 872-bit register of each ASIC. The last two bytes are dedicated to the hexadecimal code 'ACAB', indicating the end of the packet.



<span id="page-81-0"></span>Figure 5.2: Common packet format.

|            | Command        | Value (COP) |
|------------|----------------|-------------|
| PC to FPGA | Initial config | "0001"      |
|            | Hold delay     | "0010"      |
|            | Bias config    | "0011"      |
|            | Mbias config   | "0100"      |
|            | Trigger config | "1001"      |
|            | Vata config    | "0111"      |
|            | STiC config    | "1100"      |
|            | Mode config    | "1000"      |
|            | run/stop       | "1010"      |
|            | Knock-knock    | "1011"      |
| FPGA to PC | ACK            | "1000 0000" |
|            | Send data      | 41104       |

<span id="page-81-1"></span>Table 5.1: The instruction command types.
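On the PC side, a command packet with this layout could be assembled as in the following sketch. It is an illustration and not the acquisition software of this work: the function name `build_packet`, the dictionary keys and the big-endian order of the 4-byte parameter field are assumptions, as is sending the knock-knock command with an all-zero parameter field.

```python
import struct

HEADER = b"\xCA\xFE"
TRAILER = b"\xAC\xAB"

# Operational codes for the PC-to-FPGA commands of Table 5.1.
COP = {
    "initial_config": 0b0001,
    "hold_delay": 0b0010,
    "bias_config": 0b0011,
    "mbias_config": 0b0100,
    "vata_config": 0b0111,
    "mode_config": 0b1000,
    "trigger_config": 0b1001,
    "run_stop": 0b1010,
    "knock_knock": 0b1011,
    "stic_config": 0b1100,
}


def build_packet(board, command, params=0):
    """Header, board byte, COP byte, 4-byte parameter field, trailer.

    The parameter field is packed big-endian, which is an assumption.
    """
    body = struct.pack(">BBI", board, COP[command], params)
    return HEADER + body + TRAILER


packet = build_packet(board=0xFF, command="knock_knock")
print(len(packet), packet.hex())
```

The ASIC configuration commands do not fit this fixed layout, because their parameter field carries the whole slow control register.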

The component *Evaluador* is a complex finite state machine. In the initial state it checks for a new data packet from the UDP core. When the data valid flag is asserted, the machine goes to the next state, where the header code is checked. If the header bytes are not present, the packet is rejected and an ACK packet with an error indicator is generated. If they are present, the FSM continues to the next state, in which the operational code (the instruction within the packet) is recognized. When the field associated with the operational code (COP) is read, three paths are possible:

- $\Diamond$ Knock-knock: when a knock-knock packet is received, the pre-sum for the UDP (User Datagram Protocol) packet must be calculated taking into account the IP address of the PC. This pre-sum is written into the UDP LUT (lookup table), and then an acknowledge packet is built and sent to the PC.
- $\Diamond$ ASIC configuration: when an ASIC configuration packet is received, the bits for the slow control are stored in a *confAsic* FIFO. A flag is generated to activate the ASIC config block, which sends the configuration to the corresponding ASIC. The length of the configuration register is different for each type of ASIC.
- $\Diamond$ Regular configuration: the parameters of the packet are read, and control signals are generated to activate the corresponding section of the firmware that executes the command.
	- When the operational code is the *run* command, the *ASIC readout* block is activated. This block waits for a trigger signal and, according to the configured running mode, generates the readout signals for the chip. In the normal readout mode 64 ADC data samples are stored. In scope mode 256 consecutive ADC samples are stored. The *SuperFIFO* block and the TDC are also activated by the trigger signal. In the SuperFIFO the ADC data is combined with the TDC data to form the final data packet, which is sent once the UDP core is ready to send the data.

In all cases, the packet has to finish with 'ACAB'. If this does not happen, the packet is rejected, an ACK error is generated and the *confAsic* FIFO is erased, returning the FSM to the initial state.
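The checks performed by the state machine can be summarized with a small software model. This is only a behavioural sketch in Python and not the VHDL implementation: the function name `evaluate` and the returned labels are hypothetical, and the model looks at a complete packet at once instead of byte by byte as the FSM does.

```python
HEADER = b"\xCA\xFE"
TRAILER = b"\xAC\xAB"

COP_KNOCK_KNOCK = 0x0B
COP_ASIC_CONFIG = (0x07, 0x0C)   # Vata config, STiC config


def evaluate(packet):
    """Return (path, board, cop, params) for one received packet.

    path is "ack_error" when the header or the trailer check fails.
    """
    if len(packet) < 6 or packet[:2] != HEADER:
        return ("ack_error", None, None, b"")
    board, cop = packet[2], packet[3]
    if packet[-2:] != TRAILER:
        # In the firmware this also erases the confAsic FIFO.
        return ("ack_error", board, cop, b"")
    params = packet[4:-2]
    if cop == COP_KNOCK_KNOCK:
        path = "knock_knock"
    elif cop in COP_ASIC_CONFIG:
        path = "asic_config"
    else:
        path = "regular_config"
    return (path, board, cop, params)


good = bytes.fromhex("cafe000a00000001acab")   # run/stop for hybrid board 0
bad = bytes.fromhex("cafe000a000000010000")    # wrong trailer
print(evaluate(good)[0], evaluate(bad)[0])
```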

Figure [5.3](#page-83-0) shows a simulation of the component *Evaluador*. The user\_data signal shows the data word sent from the PC. In this case it is the knock-knock command, which starts with the hexadecimal header "CAFE" and finishes with the tail "ACAB". The packet is empty when it addresses all the hybrid boards. The signal *presente* is used to identify the different FSM states. Once the word is recognized, the same state machine generates the answer, which is first saved in a memory. This is shown in the second part of the state machine. The resulting word, only in this case, is 12 bytes long, with the same header and tail. The inner part consists of the following fields: "FF" is the same ID as the one received; "8B" is the reply to the command "0B", obtained by replacing the "0" with "8"; "00" is the hybrid ID, in this case the first one, number "0"; the second "00" is the error code, meaning that no errors were detected and the words were correctly received; "02" is the ASIC type, in this case Vata64hdr16; the last three bytes, "020203", correspond to the release of the firmware. The component Evaluador is thus in charge of recognizing the received command words, generating the answers that inform the PC of the correct reception, and sending the parameters to the corresponding components for configuration. The data packet is generated in the superFIFO component and is explained in Section [5.2.8.](#page-88-0)
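The 12-byte answer described above can be decoded field by field on the PC side, as in the sketch below. The function name `decode_reply` and the dictionary keys are hypothetical, and treating the change from "0B" to "8B" as setting the most significant bit of the command byte is an interpretation, not a statement about the firmware.

```python
# Reply to the knock-knock command as described in the text:
# header, ID, command, hybrid ID, error code, ASIC type, firmware release, tail.
REPLY = bytes.fromhex("CAFE" "FF" "8B" "00" "00" "02" "020203" "ACAB")


def decode_reply(reply):
    """Split a 12-byte knock-knock reply into its fields."""
    if len(reply) != 12 or reply[:2] != b"\xCA\xFE" or reply[-2:] != b"\xAC\xAB":
        raise ValueError("malformed reply")
    return {
        "destination": reply[2],
        "is_reply": bool(reply[3] & 0x80),   # assumed meaning of the leading "8"
        "command": reply[3] & 0x7F,
        "hybrid_id": reply[4],
        "error_code": reply[5],
        "asic_type": reply[6],
        "firmware": reply[7:10].hex(),
    }


fields = decode_reply(REPLY)
print(fields)
```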



<span id="page-83-0"></span>Figure 5.3: Simulation print screen of the FSM implemented in Evaluador.

### 5.2.2 Bias and Mbias programming

The component *Bias conf* is in charge of programming the DACs used to generate the threshold voltage and the additional biases needed for the configuration of the ASIC. A dedicated command from the software specifies the type of bias, the target ASIC and the desired voltage value. The component determines the corresponding DAC and channel and generates the required SPI signals to program the DAC.

### 5.2.3 ASIC configuration

The Vata64hdr16 ASIC is configured through an 872-bit register. This process is done by the component ASIC conf. This component consists of a finite state machine which generates the clock signal, *clkin*, and the data signal, *regin*. In each clock cycle one data bit is provided to the input of the ASIC. The exact value of the 872-bit register is generated by the software for each ASIC. After the board receives a dedicated command containing the register data, the component Evaluador writes this data into a FIFO memory. The FSM reads a byte from the FIFO, outputs its bits serially and then reads another byte. Once all bits are sent, the ASIC is configured. For the STiC ASIC the configuration register is 4657 bits long, and the configuration follows the same procedure as for the Vata64hdr16 ASIC.
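The byte-to-bit serialization performed by this state machine can be modelled in a few lines. The sketch is a software illustration only: the function names are hypothetical, and sending the most significant bit of each byte first is an assumption, since the bit order is not specified here.

```python
def bytes_needed(n_bits):
    """Number of FIFO bytes required to hold an n_bits long register."""
    return (n_bits + 7) // 8


def serial_bits(fifo_bytes, n_bits):
    """Yield the register bits one per clock cycle.

    Bits are taken MSB first within each byte (assumed order); bits beyond
    n_bits in the last byte are padding and are never sent.
    """
    sent = 0
    for byte in fifo_bytes:
        for shift in range(7, -1, -1):
            if sent == n_bits:
                return
            yield (byte >> shift) & 1
            sent += 1


print(bytes_needed(872), bytes_needed(4657))
print(list(serial_bits(b"\xA5", 8)))
```

The 872-bit register fills a whole number of bytes, whereas the 4657-bit STiC register leaves the last byte only partially used, so a bit counter rather than a byte counter has to terminate the sequence.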

### 5.2.4 ASICs readout

#### Vata64hdr16 readout principles

The component in charge of reading the ASICs is *ASIC readout*. A specific chronogram is needed to read out the ASIC, depending on the operation mode. The Vata64hdr16 has two operating modes: serial and scope. The component ASIC readout contains an FSM, which is configured according to the chosen run mode before the trigger signal is received. When a trigger is registered, the FSM starts the sequence. Four instances of the component are instantiated, one for each hybrid board.

The differences between the acquisition modes are explained in the following sections.

#### Serial readout mode

Figure [5.4](#page-84-0) shows a simulation from the *GPSignals* block for the serial readout mode. In this mode, after receiving the trigger signal, a counter counts the integration time. After that, the sample and hold (*sh*) and shift ($\textit{shift\_in\_d}$) signals are generated. The $\textit{shift\_in\_d}$ signal lasts only one clock cycle; then, with every clock (*gck*), it is shifted inside the ASIC to the output buffer of the next channel. 64 *gck* clock pulses have to be generated to cover all the channels. For simplicity the simulation shows only a few clock cycles. With each clock the ASIC provides at its analog output the registered charge of the corresponding channel as an amplitude. When all the channels have been read out, a reset (*res*) signal is generated to reset the ASIC shift register. No address information is generated, because the channels are read out sequentially. With every *gck* clock a flag is generated to pick the corresponding analog data from the ADC block and store it in a dedicated FIFO memory.



<span id="page-84-0"></span>Figure 5.4: Simulation snapshot of the readout signals sequences for the **serial** read out mode
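The order of the control signals in the serial mode can be written down as a simple event list. This is a loose behavioural sketch with hypothetical names: it only reproduces the order of the steps described above and says nothing about the real timing of the signals in the firmware.

```python
N_CHANNELS = 64   # channels of the Vata64hdr16


def serial_readout(hold_delay_cycles):
    """Ordered control events for one triggered readout in serial mode."""
    events = [
        ("wait", hold_delay_cycles),   # integration time counted after the trigger
        ("sh", 1),                     # sample and hold
        ("shift_in_d", 1),             # single-cycle pulse into the shift register
    ]
    for channel in range(N_CHANNELS):
        events.append(("gck", channel))          # move to the next channel
        events.append(("adc_sample", channel))   # flag to pick the ADC value
    events.append(("res", 1))                    # reset the ASIC shift register
    return events


sequence = serial_readout(hold_delay_cycles=20)
print(len(sequence), sequence[:4], sequence[-1])
```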

#### Scope readout mode

The scope readout mode is mainly used for test purposes. In this mode the analog output is connected to only one channel at a time. The channel can be configured with or without the sample and hold unit, and it is possible to observe the analog shape of the signal just before the sampling. For this purpose the bit indicating the scope mode has to be enabled, and all the channels except the channel under test have to be disabled for receiving triggers. This is done through the slow control configuration register. The next step is to provide one pulse on the  $\textit{shift\_in\_d}$  signal. This writes a '1' into the first bit of the shift register of the ASIC and connects channel 0 to the output buffer. Clocking the *gck* signal moves this bit to the channel chosen for the test. Under those conditions, if a test pulse is provided to the ASIC through the *cali* input, the ASIC generates a trigger. No more control signals are sent to the ASIC by the firmware, because the ASIC is already configured and this configuration must be kept. After a certain delay the analog pulse is presented at the output of the ASIC and its Gaussian shape can be observed. In this mode, 256 consecutive samples are taken by the ADC manager component. Those samples are enough for the software to reconstruct the pulse shape later.

#### STiC readout principles

The STiC is a free running chip and generates data-frame outputs continuously. It does not generate a trigger output, and if there is no input the output frames are empty. The data transmission of the events uses a 160MHz serial data link. The data link is encoded using 8b/10b encoding [\[Wid\]](#page-117-0), which maps each 8-bit byte to a 10-bit symbol. The mapping is needed to achieve DC balance of the signal, that is, an equal number of '0' and '1'. The encoding also provides a way to detect errors in the data transmission: invalid 10-bit symbols can be recognized by the DAQ, indicating a bad data link. The encoding also defines 12 control symbols, which can be used to control the data transmission. An important subset of these symbols are the comma symbols, which are used to reconstruct the bit alignment within the serial data stream. Every 1024 clock cycles at 160MHz, the transmission of the events stored in the internal memory of the chip is initiated. Every data frame consists of the following parts:

- $\Diamond$ the header: a control symbol which indicates the start of a new frame,
- $\Diamond$ the frame number: an 8-bit sequential number,
- $\Diamond$ the data: every event has 47 bits, as shown in Table [5.2,](#page-85-0)
- $\Diamond$ the trailer: a control symbol indicating the end of the event data,
- $\Diamond$ the number of events transmitted in this data frame.

Between the individual frames, a comma-symbol is transmitted to allow the DAQ to recover the data transmission clock and to align the sampled bytes in the data stream.
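A few timing figures follow directly from the numbers quoted for the STiC link. The sketch below assumes that one bit is transmitted per 160MHz clock cycle; the variable names are hypothetical, and the event rate is only an upper bound that ignores the header, frame number, trailer, event counter, comma symbols and any padding of the 47-bit events.

```python
LINK_CLOCK_HZ = 160e6        # serial link clock, one bit per cycle (assumed)
FRAME_PERIOD_CYCLES = 1024   # a new frame is started every 1024 cycles
EVENT_BITS = 47              # size of one event

# Time between the starts of two consecutive frames.
frame_period_s = FRAME_PERIOD_CYCLES / LINK_CLOCK_HZ

# 8b/10b carries 8 payload bits in every 10 transmitted bits.
payload_rate_bps = LINK_CLOCK_HZ * 8 / 10

# Upper bound on the sustained event rate, ignoring all framing overhead.
max_event_rate_hz = payload_rate_bps / EVENT_BITS

print(f"frame period:      {frame_period_s * 1e6:.1f} us")
print(f"payload bandwidth: {payload_rate_bps / 1e6:.0f} Mbit/s")
print(f"event rate bound:  {max_event_rate_hz / 1e6:.2f} Mevents/s")
```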



<span id="page-85-0"></span>Table 5.2: Event data format.

#### 5.2.5 Trigger manager

The Trigger manager module built into the FPGA firmware controls the acquisition trigger mode. This block issues the *start* signal for the acquisition process. There are four trigger modes:

- $\Diamond$  The *normal trigger mode*: a trigger signal coming from a hybrid board starts the acquisition process for that board. In this mode each board is read out independently.
- $\Diamond$  The *coincidence trigger mode*: this mode requires at least two triggers within a coincidence window of 6-10 ns. It is used to form coincidences between the hybrid boards.
- $\Diamond$  The *software trigger mode*: a counter generates a trigger signal every N µs. This trigger can start the acquisition for any of the four DB or for all of them at the same time. It is used for pedestal acquisitions and in some test procedures.
- $\Diamond$  The *external trigger mode*: an external trigger starts the acquisition process. This mode is used when the system is synchronized with another device or board.

The trigger mode is selected by the user through the software program, and the PC sends a specific command word to the board. The firmware evaluates the received commands and configures the Trigger manager. The data transmission is started or stopped when a second run/stop command is received. The trigger inputs are ignored during the configuration of the ASIC in order to avoid the reception of false triggers. Each incoming trigger is time-stamped by the TDC implemented in the FPGA, and the time stamp is used later for matching the coincidence events in the reconstruction process.
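The coincidence condition and the later matching of time-stamped triggers can be illustrated with a short offline sketch. It is not the firmware logic: the function name, the tuple layout `(board_id, timestamp_ns)` and the rule that a hit is used in at most one coincidence are assumptions made for the example.

```python
def find_coincidences(hits, window_ns=10.0):
    """Group time-stamped triggers into coincidences.

    hits: iterable of (board_id, timestamp_ns) tuples.
    Returns a list of groups. Each group holds hits from at least two
    different boards whose time stamps lie within window_ns of the
    first hit of the group.
    """
    ordered = sorted(hits, key=lambda h: h[1])
    groups, i = [], 0
    while i < len(ordered):
        j = i + 1
        while j < len(ordered) and ordered[j][1] - ordered[i][1] <= window_ns:
            j += 1
        group = ordered[i:j]
        if len({board for board, _ in group}) >= 2:
            groups.append(group)
            i = j       # hits used in a coincidence are not reused
        else:
            i += 1      # single-board hit: move the window start forward
    return groups

hits = [(0, 100.0), (2, 104.5), (1, 300.0), (3, 420.0), (0, 426.0)]
for group in find_coincidences(hits):
    print(group)
```

With the example hits, two coincidences are found (boards 0 and 2, and boards 3 and 0), while the isolated hit on board 1 is discarded.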

### 5.2.6 ADC manager

The *ADC manager* component receives the data from the ADC, deserializes them and separates the samples by channel. The ADC is a high-speed, free-running converter. The design of the component follows the instructions given in Xilinx application note XAPP774 (How to connect the Xilinx FPGA to the fast ADC) [\[XAP\]](#page-116-1). The details of the design are explained in the following paragraphs.

#### Implementation description

The ADC has eight LVDS outputs, each providing a serial 12-bit data stream. The ADC also provides an LVDS high-speed data clock output (DCO) and an LVDS frame clock (ADCLK). The DCO frequency is six times that of the ADCLK frame clock. The timing diagram of the different clocks and the output data was shown in Figure [4.7](#page-55-0) of Chapter [4](#page-44-0).

For this particular design the ADC clocks are: ADCLK at 16 MHz (62.5 ns period) and DCO at 96 MHz (10.4 ns period). The data are aligned in DDR format, meaning that they must be clocked into the receiver registers on both the rising and the falling edges of the receiver clock. One technique for this is to use the Xilinx component called "Digital Clock Manager" (DCM). A DCM receives the DCO clock and provides two output clocks: one is a copy of the input clock ($Clock0$) and the other ($Clock180$) is phase shifted by 180° with respect to the first. The rising edges of both clocks can then be used to clock the flip-flops.

Figure [5.5](#page-87-0) shows a single-channel receiver module. This module takes in the serial differential data of one channel and delivers it, internally to the FPGA, as 12-bit parallel data. The module is used eight times for an 8-channel ADC device. The 12 incoming serial data bits of one channel are split into two sections. The even bits are clocked on the  $Clock0$  clock, and the odd bits are clocked on the  $Clock180$  clock. When a strobe pulse is detected, these serially registered bits are stored in a parallel register. The result is a 12-bit parallel word with a jumbled data bit order. The data bits are put back in order by the Data Multiplexer, and the result is stored in a 12-bit register.
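The even/odd split and the subsequent reordering can be sketched in software as follows. This is only a behavioural illustration, not the FPGA implementation: the function name is hypothetical and the bit order on the link (MSB first) is an assumption. Note that 12 bits per 16 MHz frame correspond to 192 Mbit/s, which in DDR mode matches the 96 MHz DCO clock.

```python
def deserialize_ddr(serial_bits):
    """Rebuild one 12-bit sample from 12 serial bits received in DDR mode.

    serial_bits: list of 12 ints (0 or 1) in arrival order,
    assumed here to be MSB first.
    """
    if len(serial_bits) != 12:
        raise ValueError("expected exactly 12 bits")
    reg_clock0 = serial_bits[0::2]     # even bits, captured on Clock0
    reg_clock180 = serial_bits[1::2]   # odd bits, captured on Clock180
    # Role of the data multiplexer: interleave both registers back in order.
    ordered = [bit for pair in zip(reg_clock0, reg_clock180) for bit in pair]
    word = 0
    for bit in ordered:
        word = (word << 1) | bit
    return word

sample = 0xABC
bits = [(sample >> (11 - k)) & 1 for k in range(12)]
print(hex(deserialize_ddr(bits)))
```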



<span id="page-87-0"></span>Figure 5.5: 12-bit Single-Channel Receiver. (Figure from XAPP774.)

To process the data from the ADC, 8 receiver blocks are implemented in the firmware. The ADC works continuously, sampling the analog data in each of the 8 differential channels. When a sample is ready to be read from the ADC component, a write enable  $(EnaRAM)$  signal is generated. This application only needs to read the samples that contain data from the ASIC. These data arrive synchronously with the readout sequence of the ASIC, and therefore the right moment for saving the sample of interest can be calculated. Note that the ADC generates samples at 16 MHz, while the ASIC readout rate is 1 MHz. One ADC sample per ASIC channel is taken in the normal readout mode, and 256 consecutive samples are taken in scope mode in order to reconstruct the shape of the signal. The samples are stored in dedicated FIFO memories.
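The ratio between the two rates means that 16 ADC samples are produced during each ASIC channel slot, of which one is kept in normal mode. A minimal sketch of this selection is given below. The number of channels (64, suggested by the ASIC name and the 128-byte payload) and the position of the kept sample inside the slot (`offset`) are assumptions of the example, not values taken from the firmware.

```python
ADC_RATE_HZ = 16_000_000
ASIC_READOUT_HZ = 1_000_000
SAMPLES_PER_SLOT = ADC_RATE_HZ // ASIC_READOUT_HZ   # ADC samples per ASIC channel

def select_samples(adc_stream, n_channels=64, offset=8):
    """Keep one ADC sample per ASIC channel slot.

    offset is a hypothetical index inside the slot at which the analog
    output is assumed to have settled.
    """
    return [adc_stream[ch * SAMPLES_PER_SLOT + offset] for ch in range(n_channels)]

stream = list(range(64 * SAMPLES_PER_SLOT))   # dummy ADC stream: sample index as value
picked = select_samples(stream)
print(SAMPLES_PER_SLOT, len(picked), picked[:3])
```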

#### 5.2.7 Implemented Memory

For each hybrid board, two FIFO memories are used for the analog data, together with one bigger FIFO, the so-called *SuperFIFO*, which stores the data in the final format, ready to be sent. An FSM is dedicated to filling the SuperFIFO with encapsulated data. Additional fields are added, such as the identification number of the hybrid board, the running mode and the size of the packet, among others. The FIFO memories have programmable full and empty flags, so that they raise the full flag if they do not have enough space for a new set of data. The thresholds for the empty and full flags are set dynamically depending on the ASIC, the readout mode and other parameters. The packet structure and size vary depending on the readout mode. The following subsection describes the implemented packet options.

#### <span id="page-88-0"></span>5.2.8 Data packet formats

The packet format depends mainly on the readout mode. Figure [5.6](#page-89-0) shows the implemented packet formats. All packets start with the hexadecimal code 'CAFE' and finish with the code 'ACAB'. The second 16 bits are dedicated to the identification of the board  $(ID)$  and the operation code  $(COP)$. In the case of the normal data packet, Figure [5.6a](#page-89-0), the next 2 bytes hold the size of the packet. The size is calculated according to the readout mode of the ASIC and written into the SizeH and SizeL fields. An 8-bit field follows, which is used for the time stamps or the TDC data. There is 1 byte for the hybrid board identification and 1 byte for the readout mode. The difference between the packets lies in the next bytes: in normal mode the data occupy 128 bytes, and in scope mode 512 bytes. The other two possible frames are shown in Figure [5.6b](#page-89-0). One of them is the normal acknowledgment frame, generated by the FPGA in reply to a configuration command. This frame has only 8 bytes, including the hybrid ID, the error type, the ASIC type and one empty byte. The other possible frame is the knock-knock frame. It differs only in the operation code, in which the firmware release is included. This packet is sent whenever the PC asks for it, mainly to check the state of the communication link between the PC and the HDRDAQ board.
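A data packet with the fields listed above could be assembled and checked as in the following sketch. Several details are assumptions made for illustration and should be compared with Figure 5.6: big-endian byte order, one byte each for ID and COP, and a size field that counts only the data bytes. The function names are hypothetical.

```python
import struct

HEADER, TRAILER = 0xCAFE, 0xACAB
DATA_BYTES = {"normal": 128, "scope": 512}   # payload sizes quoted in the text
HEAD_FMT = ">HBBHBBB"                        # assumed big-endian field order

def build_packet(board_id, cop, timestamp, hybrid_id, mode_code, data):
    """Assemble a data packet; Size is assumed to count the data bytes only."""
    head = struct.pack(HEAD_FMT, HEADER, board_id, cop, len(data),
                       timestamp, hybrid_id, mode_code)
    return head + bytes(data) + struct.pack(">H", TRAILER)

def parse_packet(packet):
    """Split a packet into its fields and check the delimiters and the size."""
    head_len = struct.calcsize(HEAD_FMT)
    header, board_id, cop, size, timestamp, hybrid_id, mode_code = \
        struct.unpack(HEAD_FMT, packet[:head_len])
    (trailer,) = struct.unpack(">H", packet[-2:])
    data = packet[head_len:-2]
    if header != HEADER or trailer != TRAILER:
        raise ValueError("bad start or end code")
    if size != len(data):
        raise ValueError("size field does not match payload length")
    return {"board_id": board_id, "cop": cop, "size": size,
            "timestamp": timestamp, "hybrid_id": hybrid_id,
            "mode": mode_code, "data": data}

packet = build_packet(1, 0x01, 42, 31, 0, bytes(DATA_BYTES["normal"]))
fields = parse_packet(packet)
print(len(packet), fields["size"], fields["hybrid_id"])
```

Checking the start code, the end code and the size field on reception is a cheap way to detect truncated or misaligned packets before the payload is interpreted.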

### 5.2.9 TDC

A time-to-digital converter (TDC) is a device that provides a digital representation of a time interval. In other words, a TDC registers the time of arrival of each incoming pulse. TDCs are digital in nature and can be implemented in a fully digital style. Thus, TDCs are good candidates for implementation in a commercial FPGA. The simplest way of implementing a TDC is to use a high-frequency counter whose value is incremented at each clock cycle. When an event occurs, the accumulated number of clock periods is stored, and with the known clock frequency the time can be calculated. The drawback of this approach is that the resolution is limited to one clock period. Improving it requires a faster clock, and the stability of the clock system becomes critical.

(a) Normal and scope readout packet. (b) Normal acknowledgment packet (up) and knock-knock reply packet (down).

<span id="page-89-0"></span>Figure 5.6: Output data frames.

A popular method for TDC implementation is a counter-based technique in which a clock of moderate frequency is used for the coarse measurement, generally achieving nanosecond resolution. For the fine measurement, that is, the interval within one coarse counter period, several techniques exist to obtain the exact time interval. One of them, for example, is the Vernier method, which uses two oscillators slightly out of tune [\[Kin\]](#page-117-1) or two tapped delay lines with slightly different delays [\[And\]](#page-117-2). Figure [5.7](#page-90-0) represents a possible delay line structure. The START signal is propagated through a delay line, implemented for example with buffers (minimum propagation delay cells). At the arrival time of the STOP signal, the propagated START signal is latched. This directly gives a thermometer time code: the number of stages flipped to one gives the timing information. The resolution of the TDC is given by the buffer propagation time  $\tau$, which in a Spartan-6 FPGA is around 15-25 ps.

The delay line is normally implemented in an FPGA using a carry chain, because it is the only structure with a dedicated routing path and also the one with the smallest delay. The limitation of this structure comes from the FPGA slice structure itself and the non-uniform delay between and within carry slices. Another specific feature of FPGA TDCs is their large differential non-linearity (DNL), which shows up as a large variation of the apparent width of the TDC bins. The most significant origin of the DNL is the logic array block (LAB) structure: when the input signal in the carry chain crosses a LAB boundary, the extra delay added causes periodically wide bins.

<span id="page-90-0"></span>Figure 5.7: Sketch of a TDC delay line structure.



<span id="page-90-1"></span>Figure 5.8: Principles of the TDC sampling method.

The TDC implemented in the HDRDAQ board consists of a free-running coarse counter, a tapped delay line implemented in a carry chain, an encoder, reset logic and readout logic. The carry chain already exists in the FPGA, so manual placement is not necessary for building the delay structure; its generation can be automated and placed at the desired location. Manually specifying a starting X, Y location for the delay line ensures the best linearity and resolution, and makes the behavior stable and reproducible after each compile run. 64 carry logic blocks are instantiated, each with 4 bits or 4 delay elements, which gives a total of 256 delay elements with an average delay, for Spartan-6, of around 20 ps. This gives a 5.12 ns propagation time for the fine counter. The coarse counter clock runs at 250 MHz, that is, a 4 ns period, which is completely covered by the fine counter propagation time. Figure [5.8](#page-90-1) shows the basic principles of the method. The hit occurs at some time within the coarse counter period and starts to propagate through the delay line. At the next rising edge of the coarse counter clock, the registers of the delay line are sampled and the coarse counter value is stored. The real time stamp is calculated from the stored coarse time and the fine time $\Delta t$ measured by the delay line:

$$
T = T_{\text{coarse}} - \Delta t \tag{5.1}
$$
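As a numerical illustration of Equation 5.1, the sketch below combines a coarse counter value with a fine count. It is an idealization: all delay elements are assumed to have the nominal 20 ps width, whereas in practice the measured bin widths would be used. The function and constant names are hypothetical.

```python
COARSE_PERIOD_PS = 4000   # 250 MHz coarse counter clock
BIN_WIDTH_PS = 20         # nominal average delay per carry-chain element
N_DELAY_ELEMENTS = 256    # 64 carry blocks with 4 elements each

def timestamp_ps(coarse_count, fine_bins):
    """T = T_coarse - delta_t.

    coarse_count: coarse counter value stored at the sampling clock edge.
    fine_bins: number of delay elements the hit crossed before that edge.
    """
    t_coarse = coarse_count * COARSE_PERIOD_PS
    delta_t = fine_bins * BIN_WIDTH_PS
    return t_coarse - delta_t

# The delay line must be longer than one coarse period (5120 ps > 4000 ps).
print(N_DELAY_ELEMENTS * BIN_WIDTH_PS, COARSE_PERIOD_PS)
print(timestamp_ps(coarse_count=1000, fine_bins=57))
```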

The delay line generates a thermometer code, and an encoder is implemented to convert this code to a binary value. The encoder must be "bubble proof". In the ideal case, the 0 to 1 transition recorded by the register array is a clean thermometer code, like 000001111. However, "bubbles" at the transition edge, like 00010111, may occur due to uneven propagation delays in the FPGA structure. The encoder should be designed to output a reasonable value even when such bubbles occur.
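One common way to obtain a bubble-tolerant encoder is to count the number of ones in the sampled taps instead of searching for the position of the first 0 to 1 transition. The sketch below shows this idea on the two patterns quoted above. It illustrates the principle only and is not necessarily the encoder used in the HDRDAQ firmware.

```python
def encode_thermometer(taps):
    """Convert sampled delay-line taps (list of 0/1) to a binary fine count.

    Counting ones is insensitive to isolated bubbles near the transition,
    because a swapped 0/1 pair does not change the total.
    """
    return sum(taps)

clean = [int(c) for c in "000001111"]
bubbled = [int(c) for c in "00010111"]
print(encode_thermometer(clean), encode_thermometer(bubbled))
```

Both patterns are encoded as 4, so the bubble does not produce an outlier in the fine time.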

For the generation of the coarse counter clock, the Digital Clock Manager (DCM) from Xilinx is employed. This component takes a 50 MHz source clock and multiplies it by the required factor to provide a 250 MHz clock. This clock is then fanned out through a Phase Locked Loop (PLL) component from Xilinx to the four TDCs instantiated in the design. This provides a more stable clock frequency and a synchronized clock for the different TDCs. The results of the TDC tests are shown in Chapter [7](#page-98-0).

#### 5.2.10 The Ethernet Communications

The Ethernet communication is handled by an IP core from Xilinx [\[Xip\]](#page-116-2) that implements the first Ethernet layer. The NET layer is implemented by an IP core from OpenCores.org dedicated to UDP communication [\[Sta\]](#page-117-3). The UDP core has a look-up table (LUT) where the constant fields are pre-recorded. The ID of the board is written for each board, and the IP and Media Access Control (MAC) addresses of the PC are written when it sends a knock-knock command. The core also includes another LUT, used to send an Address Resolution Protocol (ARP) packet to the PC when a knock-knock command is received. This allows the PC to update its ARP table and then send commands directly to the identified board.

The component  $LUT$ writer is in charge of writing, in the UDP LUT and the ARP LUT, the static fields that depend on the ID of the board or on the MAC and IP addresses of the computer. These LUTs are written at two moments. The first is at startup, when the LUT writer puts the ID of the board in the fields corresponding to the least significant byte of the MAC and IP addresses of the board. Figure 5.9 represents the LUT memory. The positions in green correspond to these fields. The second write operation happens when a knock-knock packet is received. The MAC and IP addresses of the computer are captured by the MAC-UDP cores, and this information is written in the corresponding fields, shown in orange in Figure [5.9](#page-92-0).

When a command is received by the FPGA, it goes first to the MAC core. This layer removes the preamble and checks the Frame Check Sequence (FCS) field. Then it evaluates whether the destination MAC matches the ID of the board or is a broadcast MAC. If not, the packet is rejected. If the destination is the board, the packet passes to the UDP layer. This layer goes through the UDP fields and extracts the UDP information. When this information is ready, the *data\_valid* signal is asserted by the core. The component Evaluador receives this signal and the data from the UDP packet. Note that these data have previously been stored in a FIFO in the MAC core. Evaluador then performs the command management.

<span id="page-92-0"></span>Figure 5.9: Field of the LUT for the ARP and UDP.

When a packet is received, the component Evaluador generates a response (ACK packet) informing that the packet has been received and understood. This ACK is written in a FIFO memory. As previously mentioned, when a knock-knock command is received, an ARP packet is also generated. In this case, the packet is not written in any FIFO; it is pre-built in the LUT. A flip-flop is used to inform the process in the top level of the firmware that an ARP packet is ready. Its set input is connected to Evaluador and its reset signal is generated in the sender process. The transmission is started in the top level and is only possible if a flag of the UDP core is active. In that case, a transmission may start because an ACK packet is ready, an ARP packet is ready or a data packet is ready. To know whether a data packet is ready, the empty flags of the SuperFIFOs of each slot are checked. The order of the check rotates in order to give equal priority to all slots. Every time a transmission is initiated, the UDP core starts working, sending the headers. When the appropriate time for user data arrives, the FIFO for the ACKs or the SuperFIFOs are read.
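The rotating check of the SuperFIFO empty flags is a round-robin arbitration. A minimal software sketch of the idea is shown below. The function name and the convention that the search starts one slot after the last served one are assumptions of the example, not a description of the actual firmware process.

```python
def next_slot_to_send(empty_flags, last_served):
    """Return the index of the next slot whose SuperFIFO holds data, or None.

    empty_flags: list of booleans, True when the SuperFIFO of a slot is empty.
    last_served: index of the slot served in the previous transmission.
    The search starts one past last_served, so the priority rotates.
    """
    n = len(empty_flags)
    for step in range(1, n + 1):
        slot = (last_served + step) % n
        if not empty_flags[slot]:
            return slot
    return None

flags = [False, True, False, True]   # slots 0 and 2 have data ready
print(next_slot_to_send(flags, last_served=0))
print(next_slot_to_send(flags, last_served=2))
```

With these flags, slot 2 is served after slot 0 and slot 0 after slot 2, so no slot with pending data can be starved by another.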

### Conclusions

The firmware implemented in the FPGA was intended to be as flexible as possible and to take into account the many possible configurations of the system. It is continuously updated to optimize the data transfer efficiency, the TDC resolution and other aspects.

# Chapter 6

# The software

The HDRDAQ boards are controlled online using a PC with dedicated software based on DAQ++. DAQ++ is a C++ framework for developing data acquisition software [\[CLA\]](#page-114-0). The design is fully object oriented and provides a hierarchy of objects that allow full control of the acquisition system as well as on-line monitoring and storage of data.

The DAQ software, called Vdaq, provides a Graphical User Interface (GUI) for the different control objects. Vdaq has been built in a modular way, which allows user-defined modules, data receivers or RunManagers to be added dynamically. These define different data acquisition modes, such as a normal acquisition run or a parameter scan. DAQ++ also provides a number of tools to monitor, transfer and store the data. The application is general enough to dynamically load libraries containing the implementations of different Modules and RunManagers. The application is driven by an XML (Extensible Markup Language) configuration file in which the locations of the libraries and the running parameters are defined. It also provides a graphical interface to edit those configuration parameters. The following paragraphs explain the different characteristics and parameters of the Vdaq program configured for the current setup.

Figure [6.1](#page-95-0) shows the main window of the Vdaq program with an example in which two hybrid boards, corresponding to modules 31 and 33, are controlled. The user specifies the number of boards and the connector (port) to which each one is connected. In the main software window, the user can do the following:

- $\diamond$  Control the acquisition process.
- $\diamond$  Configure the ASIC.
- $\diamond$  Specify the normal, pedestal run or scan run mode.
- $\diamond$  Start, stop and pause the run process.
- Upload pedestals from previous runs.
- $\diamond$  Specify the output data file.
- $\Diamond$  Specify the maximum number of events to receive.
- $\diamond$  Monitor the actual number of events received.
- $\diamond$  Monitor the time elapsed since the beginning of the run.
- $\diamond$  Monitor the rate of the acquisition process.
- $\diamond$  Save the acquired data in a file.



<span id="page-95-0"></span>Figure 6.1: The main window of the Vdaq program

Figure [6.2](#page-96-0) shows the configuration window of module 31. In this window the user can specify:

- The ASIC type.
- $\diamond$  The readout mode (*Serial* or *Scope*).
- $\diamond$  The threshold level.
- $\Diamond$  The trigger type.
- $\diamond$  The bias levels (Mbias, VFP and others).
- The hold delay value (the SH signal can be delayed or advanced).

Many configuration parameters, like the threshold, the running mode or the bias configuration, address the hybrid board itself and are common to all ASICs on the board in case there is more than one. There are also parameters which address each ASIC separately. These are available in the second window, shown in Figure [6.3](#page-97-0), where the user can:

- $\Diamond$  Configure the ID of the ASIC.
- $\Diamond$  Select the polarity of the input signal (this depends on the type of detector used).
- Activate the test mode run.
- Enable or disable channels for acquisition run.
- $\diamond$  Select the test channel when the test mode is selected.
- $\Diamond$  Select the test mode type (internally or externally generated test pulse).
- $\circ$  Configure the amplitude of the internally generated test pulse.
- Configure the analog output data type (voltage or current).
- $\Diamond$  Configure the use (or not) of the internal peak and hold module.
- $\diamond$  Configure the gain type (low or high).
- $\Diamond$  Configure specific bits of the configuration register.

Once all the configuration parameters are selected, the 'Apply' button sends the corresponding commands to configure the HDRDAQ board as specified, together with the configuration register word for each ASIC.



<span id="page-96-0"></span>Figure 6.2: The general settings window

With the *Vdaq* program the received data can be monitored during the acquisition process and saved to a file for later offline analysis.



<span id="page-97-0"></span>Figure 6.3: The ASIC settings window

# <span id="page-98-0"></span>Chapter 7

# Hardware tests

The HDRDAQ board was manufactured, and the first tests focused on the hardware. The power supply and the basic functions of the board were verified. The next step was to check the functionality of the firmware with one hybrid board and the Vata64hdr16 ASIC. These tests show the ability of the system to control the ASIC and the acquisition process. This chapter presents the results obtained during the tests of the system. The chapter also includes characterization tests of the STiC chip, concluding that it is also a good candidate for the front-end electronics of the Petete scanner.

## 7.1 Hardware test

The hardware part of the system is the HDRDAQ board, shown in Figure [7.1](#page-99-0). The first test performed was the verification of the absence of shorts on the board, ensuring that all the different voltage levels are correctly generated. Special attention was given to the noise on the analog power supply line, which is considered of extreme importance. The amplification stage is connected to a separate analog ground, and therefore any fluctuation or noise in the analog ground would directly affect the input signal, generating noise in the system. The measured noise on the  $\pm 5V$  supply was less than  $5mV_{pp}$, and on the digital part of the board it was less than  $15mV_{pp}$.

The next test was the programming of the FPGA and the flash memory. The FPGA was programmed with a very simple firmware that makes one LED blink. This worked correctly both when programming the FPGA directly and when using the flash memory. Then a functional test was performed to check the signal conversion in the different functional parts. All signal lines going to the ASIC were checked, verifying that their outputs provided the expected logic levels. All DACs were also programmed, and the voltage levels generated at all DAC outputs and analog switches were measured. The results of these tests are shown in the following sections.



Figure 7.1: HDRDAQ board.

## 7.1.1 Trigger test

The trigger signal from the Vata64hdr16 chip is a current-type logic signal. It is converted to a voltage signal, amplified and converted to LVDS so that the FPGA can recognize it. Figure [7.2](#page-99-1) shows an oscilloscope screenshot of the trigger signal measured after each stage. Channel 3 is the output of the amplifier and channel 2 is the positive branch of the LVDS output, confirming the conversion process.

<span id="page-99-0"></span>

<span id="page-99-1"></span>Figure 7.2: Oscilloscope screenshot of the trigger signal after the different stages. The green signal  $(ch4)$  is the test pulse applied to the ASIC input. The pink signal  $(ch3)$  is the trigger signal from the ASIC after the amplifying stage. The blue signal (ch2) is the positive output of the resulting LVDS signal.

#### 7.1.2 Biases configuration

After the connection of the hybrid board to the HDRDAQ board, the first step was to configure the Mbias bias. The corresponding DAC was programmed, and 3.5 V was measured across the 5 kΩ resistor, which corresponds to the required 700 μA. The *Vthr* and *VFP* biases were correctly set up. The rest of the biases were disabled, because they are not required for the normal operation of the ASIC and are needed only for special settings.

#### 7.1.3 Readout control signal

Figure [7.3](#page-100-0) shows an oscilloscope measurement of the control signals *qck* and *shift\_in\_d* generated for the readout process of the ASIC. The voltage levels of the signals correspond to  $\pm 2.5V$, which matches the logic level requirements of the ASIC. The *shift\_in\_d* pulse spans the first falling edge of *qck*, and inside the ASIC it is then shifted to the next channel with every *qck* pulse. The hardware thus fulfills its function of creating the specific signals for the readout process of the chip.



Figure 7.3: Oscilloscope measurement of the slow control signal sequence.

### <span id="page-100-0"></span>7.1.4 Analog amplifiers test

Figure [7.4](#page-101-0) shows the trigger signal (the blue channel) and the analog output signal (the yellow channel) in scope mode, measured after the amplifier stage on the HDRDAQ board. In this readout mode the shape of the analog signal generated inside the ASIC, without the sample and hold stage, can be observed. This mode is used to set the correct time delay of the externally applied sample and hold signal with respect to the trigger signal. The integration time of the chip can be configured through the slow control configuration register. The signal observed in Figure [7.4](#page-101-0) is only the positive part of the differential output of the amplifier. The sinusoidal noise observed is due to external noise coupled to the probe and does not appear in the normal acquisition via the onboard ADCs.



<span id="page-101-0"></span>Figure 7.4: Oscilloscope measurement of the trigger and the analog output signal (LVDS) in scope readout mode.

The conclusion of these measurements is that the ASIC is correctly configured in terms of biases and configuration register. The readout sequences are generated properly in the different readout modes and the response of the ASIC is as expected.

## 7.2 TDC characterization

The functional test of the TDC was performed with a test pulse, or hit signal, fed to the FPGA through a clock input SMA connector. The hit signal goes directly to the first TDC and passes through a buffer before reaching the second TDC. The time difference between the signals is measured and the delay line is characterized.

The errors in the fine-time measurements that arise from the variations in the cell delays are characterized by the differential and integral non-linearity of the fine counter. The differential non-linearity (DNL) of a fine counter bin is defined as the relative deviation of the cell delay  $\tau_i$  from the average delay:

$$
DNL_i = \frac{\tau_i - \langle \tau_d \rangle}{\langle \tau_d \rangle} \tag{7.1}
$$

The integral non-linearity (INL) value of a cell is its total deviation from the correct fine time value and can be obtained by the summation of the DNL values of all cells prior to the investigated cell.

<span id="page-102-0"></span>
$$
INL_i = \sum_{j=1}^{i} DNL_j \tag{7.2}
$$

The DNL and INL are typically expressed in least significant bits (LSBs), which is the smallest quantization step of the digitization process.

#### 7.2.1 Fine counter Non-Linearities

In order to evaluate the differential and integral non-linearities of the delay elements, the delays of the individual fine counter bins have to be measured by a so-called code density test (CDT) [\[Kal\]](#page-118-0), which is illustrated in Figure [7.5a](#page-103-0). In this test, a defined number of trigger events are generated with a uniform distribution over the coarse counter period. The number of events recorded in a single fine counter bin is proportional to its propagation delay [\[Mot\]](#page-118-1). The expected average number of events $n_{avg}$ per fine counter bin is given by the ratio between the total number of generated events and the number of available bins. With this, the differential non-linearity of time bin $i$ can be calculated from the CDT by

$$
DNL_i = \frac{n_i}{n_{avg}} - 1, \tag{7.3}
$$

where $n_i$ is the number of recorded events in this time bin. With the known DNL values of the bins, the corresponding INL is given by Equation [7.2](#page-102-0).
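The evaluation of a code density test can be sketched in a few lines. The example below applies Equations 7.3 and 7.2 to a small, made-up histogram of four bins. The counts and the 4000 ps period used to convert counts into bin widths are illustrative values, not measured data.

```python
def code_density_test(counts, period_ps):
    """Derive DNL, INL (both in LSB) and bin widths from a CDT histogram.

    counts: number of events recorded in each fine counter bin.
    period_ps: time span over which the events were uniformly distributed.
    """
    total = sum(counts)
    n_avg = total / len(counts)
    dnl = [n / n_avg - 1 for n in counts]           # Equation 7.3
    inl, running = [], 0.0
    for d in dnl:                                   # Equation 7.2 (cumulative sum)
        running += d
        inl.append(running)
    widths_ps = [n / total * period_ps for n in counts]
    return dnl, inl, widths_ps

counts = [1000, 3000, 2000, 2000]   # synthetic histogram, for illustration only
dnl, inl, widths_ps = code_density_test(counts, period_ps=4000)
print(dnl)
print(inl)
print(widths_ps)
```

In this synthetic case the first bin is half as wide as the average and the second one is 50% wider, so their DNL values cancel and the INL returns to zero from the second bin onwards.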

Several methods exist to correct the non-linearities of a TDC, such as the use of look-up tables for the correction of the INL or the remapping of the non-uniform time bins to a uniform distribution [\[Fav\]](#page-118-2). The second method is applied in these measurements. From the known bin sizes of the delay elements, a statistical mapping is defined between the fine counter bins and a uniform distribution of bins, as shown in Figure [7.5b](#page-103-0). The probability values of the mapping are calculated from the overlap of the time intervals covered by the real fine bins and by the ideal ones. When the fine counter is mapped to a uniform distribution with the same number of bins, the more accurate time information provided by cells with a short propagation delay is lost. In the example of Figure [7.5b](#page-103-0) this is shown for the two small fine counter bins, which are mapped to the same bin in the uniform distribution. In order to retain this information and improve the resolution of the measurement, the fine counter can be mapped to a uniform distribution with smaller bin sizes.
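The overlap-based mapping can be made concrete with the following sketch, which computes, for each real bin, the probability of assigning a hit to each uniform bin. The bin widths are invented for the example and the function name is hypothetical. A hit recorded in real bin `i` would then be assigned at random to uniform bin `k` with probability `P[i][k]`.

```python
def remap_probabilities(widths, n_uniform):
    """P[i][k]: fraction of real bin i that overlaps uniform bin k."""
    total = sum(widths)
    uniform_width = total / n_uniform
    probs, start = [], 0.0
    for width in widths:
        end = start + width
        row = []
        for k in range(n_uniform):
            lo, hi = k * uniform_width, (k + 1) * uniform_width
            overlap = max(0.0, min(end, hi) - max(start, lo))
            row.append(overlap / width)
        probs.append(row)
        start = end
    return probs

widths = [10.0, 30.0, 20.0, 20.0]   # illustrative real bin widths in ps
P = remap_probabilities(widths, n_uniform=4)
for row in P:
    print([round(p, 3) for p in row])
```

With these widths, the narrow first bin falls entirely inside the first uniform bin, while the wide second bin is shared between the first and the second uniform bins with probabilities 1/3 and 2/3.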

Figure 7.6 shows the real bin delays measured with the code density test for the two TDC channels. The mean value for both channels is around 30 ps, and the second channel has one delay element with a large delay of 80 ps. With more investigation, once the cell number is identified, this delay could be reduced by relocating that element. Every carry logic block has 4 delay elements. The routing inside the block is not equal between blocks, and this is the origin of the difference in delay between the cells. It was observed that using only 2 delay elements per carry block improved the delay uniformity. Figure [7.7](#page-105-0) shows the differential non-linearities of the two TDC channels obtained with the code density test after calibration. The DNL is less than 1 LSB except for the cell with the very large delay.

<span id="page-103-0"></span>(b) Pseudo-random bin dithering for non-linearity correction of the fine counter

Figure 7.5: Determination and correction of the fine counter non-linearities [\[Har\]](#page-115-0).

Figure 7.6: Real bin delay calculated from the code density statistics for TDC channel 1 (a) and channel 2 (b).



<span id="page-105-0"></span>Figure 7.7: DNL plot for TDC channel 1 (a) and channel 2 (b).

Figure [7.8](#page-106-0) shows the measured time difference between the two TDC channels. The obtained resolution is around 16 ps. This measurement was done in nearly ideal conditions, using a clock input for the hit signal. Since the same input signal feeds both TDCs, only the buffer can be the source of the delay between the hit signals at the two TDCs. However, even if this resolution became 10 times worse with real hit signals, it would still be sufficient for the proposed application.



<span id="page-106-0"></span>Figure 7.8: Time difference between the two TDC channels.

## 7.3 STiC test

The STiC ASIC is designed to provide a very high timing resolution for the readout of SiPMs in coincidence measurements for time-of-flight applications. To test the performance of the chip, a setup was prepared as shown in Figure [7.9](#page-106-1). Coincident photons from positron annihilations from a <sup>22</sup>Na source were measured to determine the energy resolution and the Coincidence Time Resolution of the system. The 511 keV photons generated by the positron annihilation are detected by the sensors: 4x4 MPPC arrays from Hamamatsu glued to 3.1x3.1x15 mm LYSO:Ce crystals. An external power supply with two separate high voltage outputs provides the bias voltages for the SiPMs, allowing the bias voltage of each sensor to be tuned individually. The two sensors are connected to channels 20 and 33 of the STiC3, which are located on opposite sides of the chip. In this case the two TDCs are synchronized using the same reference clock. The performance of the SiPMs depends strongly on the temperature, so the setup is placed in a temperature-controlled oven at 18 °C to prevent temperature fluctuations during the measurements. The data from the chip include the energy and the time of the hits.



<span id="page-106-1"></span>Figure 7.9: STiC setup for measurements of coincidence 511 keV photon annihilation.

Figure [7.10](#page-107-0) shows an energy spectrum of a $^{22}$Na source recorded with STiC using the Time-over-Threshold (ToT) method. The $^{22}$Na spectrum shows two main photopeaks, at 511 keV and 1.275 MeV. Both photopeaks can be observed simultaneously with good resolution. The measured ToT response is not fully linear, mainly because of the limited dynamic range of the SiPM.
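The effect of a limited SiPM dynamic range on the two photopeaks can be illustrated with the commonly used exponential saturation model, in which the number of fired cells is $N_{cells}(1 - e^{-N_{det}/N_{cells}})$. The following is only a minimal sketch under stated assumptions: the cell count and the light yield per keV are hypothetical illustration values, not parameters measured in this work, and the additional non-linearity of the ToT conversion itself is not modelled.

```python
import math

# Hypothetical illustration values (assumptions, not taken from this work).
N_CELLS = 3600          # cells in one SiPM channel
DETECTED_PER_KEV = 4.0  # photons an unsaturated sensor would detect per keV


def fired_cells(n_detected, n_cells):
    """Simple saturation model: photons spread uniformly over the cells,
    each cell can fire at most once per pulse."""
    return n_cells * (1.0 - math.exp(-n_detected / n_cells))


def response_ratio(e_high_kev, e_low_kev, detected_per_kev, n_cells):
    """Ratio of the saturated sensor responses at two photon energies."""
    high = fired_cells(e_high_kev * detected_per_kev, n_cells)
    low = fired_cells(e_low_kev * detected_per_kev, n_cells)
    return high / low


if __name__ == "__main__":
    ideal = 1275.0 / 511.0
    saturated = response_ratio(1275.0, 511.0, DETECTED_PER_KEV, N_CELLS)
    print(f"ideal peak ratio: {ideal:.2f}, with saturation: {saturated:.2f}")
```

In this model the 1.275 MeV peak is compressed towards the 511 keV peak, which is the qualitative behavior expected when the sensor, rather than the readout, limits the linearity.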



Figure 7.10: Energy spectrum recorded with STiC using the ToT method.

<span id="page-107-0"></span>

Figure 7.11: Measurements of the coincidence time resolution between two channels.

### 7.3.1 Coincidence time resolution

Coincidence time resolution measurements are performed with the data from the two channels. The recorded energy data are used to select the events corresponding to the 511 keV photopeak. Compton-scattered events are rejected: only photons that have been fully absorbed in the scintillator, with an energy within $\pm 1.5\sigma$ of the 511 keV photopeak, are selected. The Coincidence Time Resolution (CTR) is obtained from the time differences of the events selected in this way. The sensor generates a current pulse whose slope depends directly on the applied overvoltage. A higher bias voltage increases the slope of the pulses and thus improves the time resolution, but it also increases the dark count rate (DCR), i.e. the noise. An optimal operating point therefore has to be found. For this purpose, the high voltage of the SiPMs was optimized in two steps. First, the bias of channel 33 was fixed at 66.3 V and only the bias voltage of channel 20 was varied until the best time resolution was obtained. Then, with the settings of channel 20 fixed, the bias voltage of channel 33 was optimized in the same way. To evaluate the time resolution, the time-difference distribution is fitted with a Gaussian function. After tuning the high voltage and the threshold settings, the best time resolution obtained is 213 ps FWHM, as shown in Figure [7.11](#page-107-0).
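The selection and fit described above can be sketched in a few lines of Python. This is an illustrative sketch, not the analysis code used for the measurements: the event layout `(energy_a, time_a, energy_b, time_b)`, the function names and the toy data are assumptions, and the Gaussian fit is replaced by an unbinned maximum-likelihood estimate (the fitted sigma is then the standard deviation of the sample), which is adequate only when the selected distribution has no significant tails.

```python
import math
import random
import statistics

# FWHM of a Gaussian in units of its standard deviation: 2*sqrt(2*ln 2)
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def in_photopeak(energy, peak, sigma, n_sigma=1.5):
    """True if `energy` lies within peak +/- n_sigma * sigma."""
    return abs(energy - peak) <= n_sigma * sigma


def select_time_differences(events, peak_a, sigma_a, peak_b, sigma_b,
                            n_sigma=1.5):
    """Keep coincidences whose two hits both fall in the photopeak window.

    `events` holds (energy_a, time_a, energy_b, time_b) tuples
    (hypothetical layout). Returns the list of time_a - time_b values.
    """
    return [
        t_a - t_b
        for e_a, t_a, e_b, t_b in events
        if in_photopeak(e_a, peak_a, sigma_a, n_sigma)
        and in_photopeak(e_b, peak_b, sigma_b, n_sigma)
    ]


def ctr_fwhm(time_differences):
    """Unbinned Gaussian estimate: sigma is the population standard
    deviation of the sample; the result is converted to FWHM."""
    return FWHM_PER_SIGMA * statistics.pstdev(time_differences)


if __name__ == "__main__":
    # Toy data only: energies in arbitrary ToT units, times in ps.
    rng = random.Random(1)
    toy = [
        (rng.gauss(100.0, 5.0), rng.gauss(0.0, 100.0),
         rng.gauss(100.0, 5.0), rng.gauss(0.0, 100.0))
        for _ in range(20000)
    ]
    dts = select_time_differences(toy, 100.0, 5.0, 100.0, 5.0)
    print(f"{len(dts)} events selected, CTR = {ctr_fwhm(dts):.1f} ps FWHM")
```

The photopeak position and width passed to `select_time_differences` are assumed to come from a separate fit to the energy spectrum of each channel.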

The STiC measurements confirm that the chip is also a good candidate for the front-end electronics of the Petete scanner. Its energy resolution is good enough to obtain a clear 511 keV photopeak, and its coincidence time resolution is excellent. Its only drawback is its high power consumption, which makes a cooling system necessary for the scanner. A cooling system also benefits the sensors by keeping them at a low and stable temperature, but it makes the design of the scanner more complex. On the other hand, the STiC chip provides a digital output, which considerably reduces the amount of back-end electronics needed for the acquisition. The HDRDAQ board has been designed to work with both ASICs (STiC and Vata64hdr16), mainly so that both chips can be evaluated before the final prototype of the scanner is built.

## Chapter 8

## Conclusions

The main motivation for the project arises from the need to improve three-dimensional (3D) medical imaging technologies, in order to reduce patient radiation exposure and to increase image quality by optimizing the spatial and temporal resolution of the scanners. This is achieved by introducing time-of-flight information into the image reconstruction. This method makes the identification of coincidence events more precise and allows background noise to be discarded. As a consequence, the signal-to-noise ratio (SNR) increases and the medical image quality improves.
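As a rough illustration of what the timing resolution means for event localization, the standard time-of-flight relation between the coincidence time resolution $\Delta t$ and the position uncertainty $\Delta x$ along the line of response can be evaluated for the 213 ps FWHM measured in Section 7.3.1. This is a textbook estimate with a rounded speed of light, not a result of the measurements in this work:

$$
\Delta x = \frac{c\,\Delta t}{2} \approx \frac{(3.0\times 10^{8}\,\mathrm{m/s})\,(213\times 10^{-12}\,\mathrm{s})}{2} \approx 3.2\,\mathrm{cm}
$$

The factor of 2 appears because a displacement of the annihilation point by $\Delta x$ shortens one photon path and lengthens the other by the same amount.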

This work describes the development of a data acquisition system for silicon photomultipliers for a full-ring small animal PET scanner (Petete). The scanner consists of 16 detector heads, each built on a custom-made hybrid board that carries the SiPM detectors and the front-end electronics, implemented as an application-specific integrated circuit (ASIC). The large number of channels (1024) calls for a compact system and demanded the development of an integrated data acquisition system for the PET scanner. In the presented work, the construction of a fully integrated DAQ system (hardware, software and firmware) has been successfully accomplished, fully covering the specific requirements of the Petete scanner. The DAQ board is designed to work with two different ASICs so that the recently developed STiC chip can be tested and evaluated. It also provides features such as the parallel readout of the detector heads and the measurement of the time of flight of the recorded events with very good timing resolution.

The hardware part of the system features very low noise and good signal integrity in the analogue signal path. This allows very good energy resolution without degrading the intrinsic resolution of the silicon photomultipliers and the readout ASICs, and a coincidence time resolution of 213.6 ps FWHM was obtained with the STiC chip. This value was obtained with 4x4 MPPC arrays (S12643-050CN(X)) from Hamamatsu glued to 3.1x3.1x15 mm LYSO:Ce crystals, and it is among the best recent values obtained with this type of sensor.

The present work includes ASIC development, in particular part of the digital electronics and the back-end design of the STiC ASIC. This work has been accomplished with very good results, as noted in the previous paragraph, and opens the possibility of working on new cutting-edge front-end chips in the future. The software and firmware implemented for the system fully satisfy the requirements, providing a highly configurable system with fast data transmission via Gigabit Ethernet. Different scenarios can be configured, such as different readout modes, different test options, and an independent configuration for each hybrid board. The experimental tests carried out verify the correct functional behavior of all subsystems, as explained in this thesis. The system will be used for research as well as in laboratory setups for the test and characterization of new silicon sensors and scintillators. For this purpose the system is compatible with different detector heads and can easily be reconfigured to work with them.

At the moment the HDRDAQ board is operated with two hybrid boards with 64 channels each. The full system test with four detector heads is planned for the near future. Further steps will include the use of several HDRDAQ boards to completely cover the number of detector heads.

# Acknowledgments

I would like to say thanks to the people without whom this work wouldn't have been possible:

To Dr. Carlos Lacasta and Dra. Gabriela Llosá, for giving me the opportunity to be part of the medical physics group at IFIC, and for their teaching, patience and support over the years.

To Dr. Vicente González, for his time and patience in answering my questions, and for the long discussions about the design of the electronics.

To Carles, for his ideas and fast solutions to problems, and for always making work in the lab a real pleasure.

To Prof. Dr. Hans-Christian Schultz-Coulon and PicoSEC<sup>[1](#page-112-0)</sup>, for their continuous support during the last years and for making this PhD project possible.

To Volker, for the large amount of time spent with me and for showing me the secrets of the PCB design.

To Tobias, Konrad, Huangshan, Wei and Yonathan of the detector development group at KIP for answering all my questions and doubts.

To Alex, for being next to me from the beginning and giving me courage and strength to continue during the difficult periods.

To my father, who opened my mind and sparked my curiosity about electronics.

#### Thanks, Gracias, Gràcies, Dankeschön, Благодаря and 谢谢

<span id="page-112-0"></span><sup>1</sup>Marie Curie Early Initial Training Network Fellowship of the European Community's Seventh Framework Program under contract number (PITN-GA-2011-289355-PicoSEC-MCNet).

## Bibliography

- [CLA] C. Lacasta, "DAQ++: A C++ Data Acquisition Software Framework", Real-Time Conference, 2007 15th IEEE-NPSS [6](#page-94-0)
- [Etx] A. Etxebeste, Medical Physics master thesis, 2012.
- [Llo] G. Llosa, J. Barrio, J. Cabello, C. Lacasta, J. F. Oliver, M. Rafecas, V. Stankova, C. Solaz, M. G. Bisogni, A. Del Guerra, "Detectors based on Silicon Photomultiplier Arrays for Medical Imaging Applications" 978-1-4577-0927-2/11/ IEEE.
- [Ham] Hamamatsu, MPPC and MPPC module for precision measurement, http://www.hamamatsu.com [2.1,](#page-24-0) [2.2,](#page-26-0) [2.3\(a\),](#page-26-1) [2.5](#page-28-0)
- [She] Wei Shen. Development of High Performance Readout ASICs for Silicon Photomultipliers (SiPMs). PhD thesis, Heidelberg University, 2012 [2.1,](#page-24-0) [2.1.1,](#page-25-0) [2.1.5](#page-28-1)
- [Mpp] Hamamatsu MPPC User Manual, http://www.hamamatsu.com
- [Mpc] Hamamatsu, MPPC modules, http://www.hamamatsu.com
- [Eck] Patrick Eckert, Hans-Christian Schultz-Coulon, Wei Shen, Rainer Stamen, and Alexander Tadday. Characterisation studies of silicon photomultipliers. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 620(2): 217-226, 2010. [2.1.5](#page-28-1)
- [She] Wei Shen, Development of High Performance Readout ASICs for Silicon Photomultipliers (SiPMs), PhD thesis, Universitat Heidelberg, 2012. [2.1,](#page-24-0) [2.1.1,](#page-25-0) [2.1.5](#page-28-1)
- [Ide] Gamma Medica-Ideas, Inc. (Norway), http://www.GM-ideas.com [2.2,](#page-30-0) [2.8](#page-31-0)
- [Har] Tobias Harion, The STiC ASIC Development, Characterization and System Integration, PhD thesis, Universitat Heidelberg, 2015. [1.2,](#page-14-0) [1.2.1,](#page-15-0) [2.3\(b\),](#page-26-2) [2.1.3,](#page-27-0) [2.6,](#page-29-0) [3.2,](#page-36-0) [7.5](#page-103-0)
- [Vac] Antonin Vacheret, et al. Characterization and simulation of the response of multi-pixel photon counters to low light levels. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 656(1):69-83, 2011. [2.1.2](#page-27-1)
- [Dat] Hamamatsu Photonics KK. Mppc s1257-010 datasheet. Technical report, 2013. [2.1.2](#page-27-1)
- [Bog] Bogdan Povh, Martin Lavelle, Klaus Rith, Christoph Scholz, and Frank Zetsche. Particles and nuclei: an introduction to the physical concepts. Springer Science and Business Media, 2008. [1.2.1](#page-16-0)
- [Con] Maurizio Conti, Inki Hong, and Christian Michel. Reconstruction of scattered and unscattered pet coincidences using tof and energy information. Physics in Medicine and Biology, 57(15):N307, 2012. [1.2.1](#page-16-1)
- [Mos] W.W. Moses, Time of flight in PET revisited. Nuclear Science, IEEE Transactions on, 50(5):1325-1330, Oct 2003. [1.1,](#page-12-0) [1.2.2](#page-17-0)
- [Tom] Takehiro Tomitani. Image reconstruction and noise evaluation in photon time-of-flight assisted positron emission tomography. Nuclear Science, IEEE Transactions on, 28(6):4581-4589, Dec 1981. [1.2.2](#page-17-0)
- [Cont] Maurizio Conti. Focus on time-of-flight PET: the benefits of improved time resolution. European journal of nuclear medicine and molecular imaging, 38(6):1147-1157, 2011. [1.2.2](#page-17-0)
- [Mad] V. Stankova "Development of a data acquisition system for silicon multi-pixel detectors", Research work, 2012
- [Llos] Gabriela Llosa, John Barrio, Jorge Cabello, John E. Gillam, Carlos Lacasta, Josep. F. Oliver, Magdalena Rafecas, Carles Solaz, Paola Solevi, Vera Stankova, Irene Torres-Espallardo, Marco Trovato "Second LaBr3 Compton Telescope Prototype", 978-1-4799-1047-2/13/ IEEE
- [Sol] C. Solaz, J. Barrio, G. Llosa, V. Stankova, M. Trovato, C. Lacasta, "Data Acquisition System for the Readout of SiPM Arrays", 978-1-4799-0534-8/13/ IEEE
- [Lal] Omega micro (Laboratoire Leprince Ringuet, Palaiseau, France), HTTP://omega.in2p3.fr/ [2.2](#page-30-0)
- [Sti] Wei Shen, Briggl, K., Huangshan Chen, Fischer, P., Gil, A., Harion, T., Ritzert, M., Schultz-Coulon, H.-C., "STiC - A mixed mode chip for SiPM ToF applications", DOI: 10.1109/NSSMIC.2012.6551231 [2.2](#page-30-0)
- [AD8] Analog Devices, AD8139 datasheet, http://www.analog.com/ [4.3.2](#page-49-0)
- [AD9] Analog Devices, AD9222 datasheet, http://www.analog.com/ [4.3.3,](#page-51-0) [4.7](#page-55-0)
- [Lit] Linear Technology, LTC2620, datasheet, http://www.linear.com/product/LTC2620 [4.3.4](#page-55-1)
- [Hor] Horowitz, P. and Hill, W. "The art of electronics." Cambridge University Press, 1989. [4.7](#page-72-0)
- [LT1] Linear Technology, LT1185, data sheet. [4.5.1](#page-68-0)
- [XAP] Xilinx, XAPP774, Connecting Xilinx FPGAs to Texas Instruments ADS527x Series ADCs, 2006 [4.5.2,](#page-70-0) [5.2.6](#page-86-0)
- [DP8] National Semiconductors, DP83865, data sheet. [4.4.9](#page-64-0)
- [Xip] Xilinx User Guide, "LogiCORE IP Tri-Mode Ethernet MAC v4.5", UG138 March 1, 2011. [4.4.9,](#page-64-0) [5.2.10](#page-91-0)
- [Ads] Xilinx, UG161, Platform Flash PROM User Guide, 2009 [4.12](#page-65-0)
- [Fla] Xilinx, DS123 Platform Flash In-System Programmable Configuration PROMS, 2005 [4.4.8](#page-63-0)
- [ADP] Analog Devices, ADP3339 High Accuracy, Ultralow IQ, 1.5 A, anyCAP Low Dropout Regulator [4.13,](#page-67-0) [4.14](#page-68-0)
- [Pth] Texas Instruments, PTH5050W, data sheet. [4.5.1](#page-68-0)
- [Pth2] Texas Instruments, PTH4050a, data sheet. [4.5.1](#page-68-0)
- [Ale] Xilinx, DS162 "Spartan-6 FPGA Data Sheet: DC and Switching Characteristics". [4.4.7](#page-61-0)
- [Ppg] Xilinx, UG393 "Spartan-6 FPGA PCB Design and Pin Planning Guide". [4.4.7,](#page-61-0) [4.4.7](#page-63-1)
- [Xil] Xilinx, "Spartan-6 FPGA family" data sheet., http://www.xilinx.com/
- [Dec] Xilinx, XAPP623 "Power Distribution System (PDS) Design: Using Bypass/Decoupling Capacitors". [1,](#page-12-1) [4.5](#page-63-1)
- [Wid] Albert X. Widmer and Peter A. Franaszek. A dc-balanced, partitioned-block, 8b/10b transmission code. IBM Journal of research and development, 27(5):440-451, 1983. [5.2.4](#page-84-0)
- [Wu] J. Wu and Z. Shi, The 10-ps wave union TDC: Improving FPGA TDC resolution beyond its cell delay. IEEE Nucl. Sci. Conf. R. (2008)3440.
- [Kin] Barton R D and King M E. Two vernier time-interval digitizers. Nucl. Instrum. Methods, 97(359-70), 1971. [5.2.9](#page-88-0)
- [And] M.S. Andaloussi, M. Boukadoum, and E.M. Aboulhamid. A novel time-to-digital converter with 150 ps time resolution and 2.5ns pulse-pair resolution. Microelectronics, The 14th International Conference on 2002 - ICM, pages 123-126, Dec. 2002. [5.2.9](#page-88-0)
- [Fis] I. Sacco, P. Fischer, and M. Ritzert. Peta4: a multi-channel tdc/adc asic for sipm readout. Journal of Instrumentation, 8(12):C12013, 2013 [3.3](#page-36-1)
- [Sta] Stamatakis, A. The Exelixis Lab, "Efficient PC-FPGA communication over gigabit Ethernet". [5.2.10](#page-91-0)
- [Far] Faraday Technology Corporation. Umc 0.18µm generic ii library. http://freelibrary.faraday.com.tw. [3.5](#page-41-0)
- [Rit] Michael Ritzert. Development and Test of a High Performance Multi Channel Readout System on a Chip with Application in PET/MR. PhD thesis, Universitat Heidelberg, 2014. [3.3](#page-36-1)
- [Alb] Albert X. Widmer and Peter A. Franaszek. A dc-balanced, partitioned-block, 8b/10b transmission code. IBM Journal of research and development, 27(5):440-451, 1983. [3.4](#page-38-0)
- [Bai] D. L. Bailey, D. W. Townsend, P. E. Valk, and M. N. Maisey, "Positron Emission Tomography", Springer, London, 2005 [1.2.6](#page-19-0)
- [Lin] PET and PET/CT: A Clinical Guide by Eugene C. Lin from Thieme. [1.2](#page-14-0)
- [Wer] M. N. Wernick and J. N. Aarsvold, Emission tomography. The Fundamentals of PET and SPECT. Elsevier academic press, 2004 [1.2](#page-13-0)
- [End] M. Endo, "Recent progress in medical imaging technology" Systems and Computers in Japan, vol. 36, pp. 1-17, 2005 [1,](#page-12-1) [1.1](#page-12-0)
- [Hen] W. R. Hendee, " Physics and application of medical imaging", Rev. Mod. Phys., vol. 71, no 2, pp. 444-450, 1999. [1,](#page-12-1) [1.1](#page-12-0)
- [Kal] J. Kalisz, R. Szplet, J. Pasierbinski, and A. Poniecki. Field-programmable-gate-array-based time-to-digital converter with 200-ps resolution. Instrumentation and Measurement, IEEE Transactions on, 46(1):51-55, Feb 1997. [7.2.1](#page-102-0)
- [Mos] W. W. Moses, "Overview of nuclear medical imaging instrumentation and techniques.", Proc SCFIF97 Conference on Scintillating and Fiber Detectors, vol. 450, pp. 477-488, 1997 [1.1,](#page-12-0) [1.2.2](#page-17-0)
- [Fav] Claudio Favi and Edoardo Charbon. A 17ps time-to-digital converter implemented in 65nm fpga technology. In Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays, pages 113-120. ACM, 2009. [7.2.1](#page-103-0)
- [Mot] Manuel Mota and Jorgen Christiansen. A high-resolution time interpolator based on a delay locked loop and an rc delay line. Solid-State Circuits, IEEE Journal of, 34(10):1360-1366, 1999. [7.2.1](#page-102-0)