## Data Acquisition System for the Belle Silicon Vertex Detector Upgrade

Yamashita Yasushi Department of Physics, University of Tokyo

January 10th, 2002

# Contents

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 2 | KEK *B*-Factory |
| 2.1 | KEKB Accelerator |
| 2.2 | Belle Detector |
| 2.2.1 | Silicon Vertex Detector (SVD) |
| 2.2.2 | Central Drift Chamber (CDC) |
| 2.2.3 | Aerogel Čerenkov counter (ACC) |
| 2.2.4 | Time/Trigger of Flight Counter (TOF) |
| 2.2.5 | Electromagnetic calorimeter (ECL) |
| 2.2.6 | Solenoid magnet |
| 2.2.7 | $K_L$ and Muon Detector (KLM) |
| 2.2.8 | Trigger System |
| 2.2.9 | Data Acquisition System |
| 3 | Upgrade of the SVD |
| 3.1 | SVD2 Detector |
| 3.1.1 | Mechanical Structure of the SVD2 |
| 3.1.2 | Double-Sided Strip Detector |
| 3.2 | SVD2 Data Acquisition System |
| 3.2.1 | Front-end Readout Chip |
| 3.2.2 | Repeater System |
| 3.2.3 | Flash Analog-to-Digital Converter |
| 3.2.4 | PC Farm |
| 4 | PCI Board |
| 4.1 | Requirements for PPCI |
| 4.2 | Hardware Description of PPCI |
| 4.3 | Transfer Protocol between FADC and PPCI |
| 4.4 | Measurement of Data Receiving Speed with PPCI |
| 4.5 | Test for Error Rate |
| 4.6 | Performance Summary of PPCI |
| 5 | Data Processing in PC |
| 5.1 | Overview of Data Process in a PC |
| 5.1.1 | Read from PPCI Thread |
| 5.1.2 | Double Buffer |
| 5.1.3 | Data Process and Reformat Thread |
| 5.1.4 | Merge Buffer |
| 5.1.5 | Merge Event and Send to Event Builder Thread |
| 5.2 | Explanation for Characteristic Values of FADC |
| 5.3 | Data Processing and Reformatting |
| 5.3.1 | Pedestal Subtraction |
| 5.3.2 | Common Mode Subtraction |
| 5.3.3 | Signal to Noise Ratio Calculation |
| 5.3.4 | Signal Hit Search |
| 5.3.5 | Data Reformatting |
| 5.4 | System Test Environments |
| 5.5 | Event Builder for System Test |
| 5.6 | Limitation of Event Builder |
| 5.7 | Compression Ratio |
| 5.8 | Speed Test |
| 5.9 | Stability Test |
| 5.10 | Summary and Discussion of Data Processing in PC |
| 6 | Conclusion |
|  | Acknowledgement |
| A | CP Violation in B Decays |
| A.1 | Introduction |
| A.2 | Cabibbo-Kobayashi-Maskawa Matrix |
| A.3 | Measuring $CP$ Asymmetry in $B$ Meson Decays |

# List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Configuration of the KEKB accelerator system. |
| 2.2 | Schematic view of the Belle detector. |
| 2.3 | Side and end views of the Belle SVD. |
| 2.4 | Block diagram of the SVD readout system. |
| 2.5 | Block diagram of the VA1 chip. |
| 2.6 | Structure of central drift chamber. |
| 2.7 | The configuration of the ACC. |
| 2.8 | The configuration of the TOF/TSC. |
| 2.9 | The configuration of the electromagnetic calorimeter. |
| 2.10 | The barrel and endcap parts of the Belle KLM. |
| 2.11 | Belle trigger system. |
| 2.12 | Schematic view of the Belle DAQ system. |
| 3.1 | The $r$-$z$ view of the SVD2. |
| 3.2 | The $r$-$\phi$ view of the SVD2. |
| 3.3 | The schematic drawings of the ladder before (upper) and after (lower) assembly. |
| 3.4 | Schematic drawing of the double-sided silicon strip detector (DSSD). |
| 3.5 | Schematic drawing of the SVD readout system. |
| 3.6 | The block diagram of the FADC board. |
| 3.7 | The data format of FADC output. The numbers above the bar (from 0 to 31) are bit numbers and numbers in the bar are the content tag. |
| 3.8 | Data format of the FADC output of a single event. |
| 4.1 | The picture of the PPCI IV board. |
| 4.2 | The signals between the FADC and the PPCI. The clock signal is XCLOCK and data lines are XDATA. XREQUEST and XVALID are sent from the FADC to the PPCI, and XREADY and XENABLE are from the PPCI to the FADC. |
| 4.3 | The timing chart of the beginning of the data transfer between the FADC and the PPCI. |
| 4.4 | The timing chart at the termination of the data transfer by the FADC between the FADC and the PPCI. |
| 4.5 | The timing chart at the termination of data transfer between the FADC and the PPCI due to the PPCI being busy. |
| 4.6 | The schematic drawing of the test bench for the transfer speed test. The PPCI, instead of the FADC, sends data, and the other PPCI receives the data. Only one PPCI is installed in the sender PC to reduce the CPU and bus load of the PC. The sender and the receiver PPCI are connected with a 5-meter LVDS cable. |
| 4.7 | The data receiving speed as a function of the number of PPCI boards. The solid line and the dashed line show the speed per PPCI and the total speed, respectively. The study is performed without any data processing in the receiver PC to avoid possible latency. Even when four PPCIs are installed in the PC, the PPCI itself still clears the requirement of 6 MB/s, which is indicated by the horizontal line. |
| 4.8 | The schematic drawing of the test bench for the error test of the PPCI. Three PPCIs are installed in one PC, where each PPCI is connected to another PPCI installed in another PC with a 5-meter LVDS cable. |
| 5.1 | The flowchart of the data processing in the PC in the case that three PPCIs are installed. A square, an arrow, and an ellipse indicate a thread, a data stream, and a shared memory, respectively. |
| 5.2 | The schematic drawing of how the double buffer works. |
| 5.3 | The flowchart of the data compression. |
| 5.4 | The hatched area shows the fluctuation of the pedestal- and common-mode-subtracted ADC counts. The RMS of the fluctuation is defined as the intrinsic noise. The signal ADC count is pedestal- and common-mode subtracted. |
| 5.5 | A charged particle penetrating a DSSD with (a) a small incidence angle and (b) a large incidence angle. |
| 5.6 | The data packing format of one hybrid. The formatted data consist of a header, strip hit data, pedestal and noise values, and a footer. |
| 5.7 | Ladder and scintillation counter arrangement for the system test. Installed ladders are represented by filled boxes and vacant ladder slots are represented by open boxes. |
| 5.8 | The data acquisition system from the PC farm to storage system for the system test. |
| 5.9 | Time Dependence of Data-Processing Speed |
| 5.10 | Data-Processing Speed in Various Number of Sender PCs |
| 5.11 | Compression Data Size |
| 5.12 | Process speed in PC |
| 5.13 | Pedestal Alteration in Long Run Test |
| 5.14 | Intrinsic Noise Alteration in Long Run Test |
| A.1 | The unitarity triangle |
| A.2 | The Feynman diagrams responsible for $B^0$-$\overline{B}{}^0$ mixing |
| A.3 | The Feynman diagrams of semi-leptonic $B^0$ and $\overline{B}{}^0$ decays |

# List of Tables

| Table | Caption |
|-------|---------|
| 2.1 | Total cross section and trigger rates with $L = 10^{34}\ \text{cm}^{-2}\text{s}^{-1}$ from various physics processes at $\Upsilon(4S)$ |
| 3.1 | Inner radius, number of DSSDs in $\phi$, and number of DSSDs in $z$ for each layer |
| 5.1 | Specifications of PC Used for Data Compression |
| 5.2 | Specifications of the PC used for the event builder |
| 5.3 | Correspondence between occupancy and compression ratio |
| 5.4 | Measured occupancy and distance from the center of the beam pipe of the SVD1 |
| 5.5 | The occupancy estimation and distance from the center of the beam pipe at the SVD2 |

## Chapter 1

## Introduction

One of the most important discoveries in modern elementary particle physics is the existence of CP violation. CP violation is expected to be related to a basic principle of nature: it may answer one of the most compelling questions in cosmology and elementary particle physics, namely why the universe we live in consists predominantly of matter. CP violation was discovered in the neutral K meson system by J. H. Christenson *et al.* in 1964 [1]. In 1973, M. Kobayashi and T. Maskawa proposed the KM-model, an extension of the Cabibbo matrix that can explain CP violation within the framework of the Standard Model (SM) [2]. In 1980, Sanda and Carter pointed out that sizable CP violating asymmetries could be observed in certain decay modes of B mesons [3]. To observe CP violation and to test the KM-model in the B meson system, we started a B-factory experiment, Belle, in 1999, using the $e^+e^-$ collider (KEKB) at the High Energy Accelerator Research Organization (KEK). The center-of-mass energy of the Belle experiment is set to the mass of the $\Upsilon(4S)$ state, where a pair of $B$ and $\overline{B}$ mesons is produced. KEKB has achieved a high luminosity of $8.26 \times 10^{33}\ \text{cm}^{-2}\text{s}^{-1}$ and has produced a large number of $\Upsilon(4S)$ mesons.

In the  $\Upsilon(4S)$  system, a time-dependent CP violating asymmetry,  $A_{CP}$ , of B mesons decaying into a CP eigenstate,  $f_{CP}$ , is given by

$$A_{CP}(\Delta t) = \frac{\Gamma(\overline{B}{}^0 \to f_{CP}) - \Gamma(B^0 \to f_{CP})}{\Gamma(\overline{B}{}^0 \to f_{CP}) + \Gamma(B^0 \to f_{CP})} = \mathcal{S} \sin(\Delta m_d \Delta t) + \mathcal{A} \cos(\Delta m_d \Delta t), \tag{1.1}$$

where $\mathcal{S}$ and $\mathcal{A}$ are parameters related to CP violation in the KM-model, and $\Delta m_d$ is the mass difference between the two $B^0$ mass eigenstates. $\Delta t$ is defined as the proper-time difference, $\Delta t \equiv t_{CP} - t_{\text{tag}}$, where $t_{CP}$ and $t_{\text{tag}}$ are the proper times of the $B^0 \rightarrow f_{CP}$ decay and of the associated $B$ meson decay, respectively. Since the $\mathcal{S}$ term of $A_{CP}$ vanishes when integrated over $\Delta t$, a precise determination of $\Delta t$ is crucial for the observation of the time-dependent CP violating asymmetry.
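The vanishing of the $\mathcal{S}$ term can be checked with a short calculation. This is only a sketch: it assumes the standard exponential decay-time envelope $e^{-|\Delta t|/\tau_{B^0}}$ for both flavor-tagged rates, which is not written out above, and $\tau_{B^0}$ (the $B^0$ lifetime) and $x_d \equiv \Delta m_d \tau_{B^0}$ are symbols introduced here for illustration. Under that assumption,

$$\Gamma(\overline{B}{}^0 \to f_{CP}),\ \Gamma(B^0 \to f_{CP}) \;\propto\; e^{-|\Delta t|/\tau_{B^0}} \left[ 1 \pm \left( \mathcal{S} \sin(\Delta m_d \Delta t) + \mathcal{A} \cos(\Delta m_d \Delta t) \right) \right],$$

with the upper sign for $\overline{B}{}^0$ and the lower sign for $B^0$. Integrating over $-\infty < \Delta t < \infty$, the sine term is odd in $\Delta t$ and drops out, while the cosine term survives:

$$\int_{-\infty}^{\infty} e^{-|\Delta t|/\tau_{B^0}} \sin(\Delta m_d \Delta t)\, d\Delta t = 0, \qquad \int_{-\infty}^{\infty} e^{-|\Delta t|/\tau_{B^0}} \cos(\Delta m_d \Delta t)\, d\Delta t = \frac{2\tau_{B^0}}{1 + x_d^2}.$$

The time-integrated asymmetry is therefore $\mathcal{A}/(1 + x_d^2)$ and carries no information on $\mathcal{S}$, which is why $\Delta t$ must be measured event by event.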

Since KEKB is an energy-asymmetric $e^+e^-$ collider, with an 8.0 GeV electron beam and a 3.5 GeV positron beam, the $\Upsilon(4S)$ is produced with a constant boost along the electron beam direction (z-axis). Since the mass of the $\Upsilon(4S)$ is close to twice the $B^0$ mass, the two $B$ mesons are produced almost at rest in the $\Upsilon(4S)$ rest frame, so both move with almost the same constant velocity along the z-axis. We can therefore calculate $\Delta t$ by measuring the two $B$ decay vertices:

$$\Delta t \simeq \frac{z_{CP} - z_{\text{tag}}}{\beta \gamma_B c} \equiv \frac{\Delta z}{\beta \gamma_B c} \tag{1.2}$$

where $z_{CP}$ and $z_{\text{tag}}$ are the decay vertices of $B^0 \to f_{CP}$ and of the associated *B* decay, respectively, and $\beta \gamma_B$ (0.425 in our experimental configuration) is the Lorentz boost factor of the two *B* mesons. The typical value of $\Delta z$ is 200 $\mu$m because the *B* meson lifetime is about 1.5 ps.
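The conversion in Eq. (1.2) can be illustrated with a few lines of Python. This is a minimal sketch, not code from the Belle software; the function names `dz_to_dt` and `dt_to_dz` are hypothetical, and only the boost factor 0.425 and the 1.5 ps lifetime are taken from the text.

```python
# Convert between vertex separation (micrometres) and proper-time
# difference (picoseconds) following Eq. (1.2).
C_UM_PER_PS = 299.792458  # speed of light in micrometres per picosecond
BETA_GAMMA = 0.425        # Lorentz boost factor quoted in the text


def dz_to_dt(delta_z_um, beta_gamma=BETA_GAMMA):
    """Proper-time difference in ps for a vertex separation in micrometres."""
    return delta_z_um / (beta_gamma * C_UM_PER_PS)


def dt_to_dz(delta_t_ps, beta_gamma=BETA_GAMMA):
    """Vertex separation in micrometres for a proper-time difference in ps."""
    return delta_t_ps * beta_gamma * C_UM_PER_PS


# One B lifetime (about 1.5 ps) corresponds to roughly 190 micrometres,
# consistent with the "typical value of 200 micrometres" quoted above.
typical_dz_um = dt_to_dz(1.5)
print(f"dz for 1.5 ps: {typical_dz_um:.0f} um")
```

The same relation shows why the vertex resolution matters: an uncertainty of 100 $\mu$m in $\Delta z$ translates into roughly 0.8 ps in $\Delta t$, about half a $B$ lifetime.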

For the precise determination of $\Delta z$, we have developed a silicon vertex detector (SVD) that has achieved a $\Delta z$ resolution of ~ 200 $\mu$m, and we have established *CP* violation in neutral *B* meson decays [4]. However, many unsolved problems and challenges remain. We need much higher precision in the measurement of the *CP* violating parameters in order to check whether the KM-model is the only source of CP violation. We also need a precise determination of the remaining parameters of the KM-model and searches for new physics. To this end, we need more precise measurements of the *B* decay vertices and a much higher beam luminosity.

We are planning an upgrade of the Belle detector. The current SVD, which consists of three cylindrical layers of silicon detectors, will be replaced with a four-layer SVD. As a consequence, the data size is expected to be at least 1.5 times larger than the current size. To accommodate the higher luminosity, the maximum trigger rate will be increased by a factor of two.

In order to satisfy these requirements, we have decided to construct a PC-based data acquisition (DAQ) system for the upgraded SVD. This solution has many advantages over the current DAQ system, which is based on the VME bus. PCs are less expensive than VME CPU modules and are easier to maintain. Furthermore, the PC market evolves rapidly, which allows us to replace system components with the best commercial products available at the time, at very low prices. The data size will be reduced by the CPUs of the PCs through pedestal subtraction, particle hit finding, and data compression. The software can be developed in the C programming language, which requires less specialized skill than the assembly language of the Motorola DSP used in the current SVD DAQ. We use a PCI module to receive data from the flash analog-to-digital converter (FADC).
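The data reduction steps named above can be sketched as follows. This is an illustrative toy in Python, not the thesis software (which is described in Chapter 5): the function name `reduce_chip`, the use of the median as the common-mode estimate, and the threshold of three times the noise are all assumptions made here for the example.

```python
from statistics import median


def reduce_chip(raw_adc, pedestals, noise, threshold=3.0):
    """Return (strip index, signal) pairs for strips that look like hits."""
    # Pedestal subtraction: remove each strip's individual baseline.
    subtracted = [r - p for r, p in zip(raw_adc, pedestals)]
    # Common-mode subtraction: remove the shift shared by all strips in
    # this event. The median is an assumed, outlier-robust estimator.
    common_mode = median(subtracted)
    signals = [s - common_mode for s in subtracted]
    # Hit finding: keep only strips whose signal exceeds threshold * noise.
    return [
        (i, s)
        for i, (s, n) in enumerate(zip(signals, noise))
        if s > threshold * n
    ]


# Toy event: eight strips with a common shift of +2 counts and one hit.
raw = [102, 99, 101, 140, 100, 98, 103, 101]
ped = [100, 97, 99, 100, 98, 96, 101, 99]
noi = [2.0] * 8
hits = reduce_chip(raw, ped, noi)
print(hits)
```

Only the strips that pass the hit search need to be written out, which is where the reduction in data size comes from.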

In this thesis, we present the performance of the PC-based data acquisition system for the upgraded Belle silicon vertex detector, including the data receiving speed of the PCI module, its stability and reliability tests, and the processing speed of the data reduction software.

This thesis is organized as follows. The experimental apparatus, consisting of the KEKB accelerator and the Belle detector, including the current data acquisition system, is described in Chapter 2. An overview of the SVD upgrade is presented in Chapter 3. We describe the PCI module and its performance in Chapter 4. In Chapter 5, we present the development and performance of the data reduction software. Chapter 6 concludes this thesis.

## Chapter 2

## KEK *B*-Factory

The primary goal of the KEK *B*-factory experiment is the study of *CP* violation in the *B* meson system. The KEKB accelerator is an asymmetric $e^+e^-$ collider that produces *B* mesons. The decay products of the *B* mesons are detected by the Belle detector. Section 2.1 gives a brief introduction to the KEKB accelerator. Section 2.2 gives an overview of the Belle detector and describes its principal components.

## 2.1 KEKB Accelerator

As described in the previous chapter, to measure the CP asymmetry in decays of B mesons, one must produce $\Upsilon(4S)$ mesons that are boosted in the laboratory frame. The KEKB accelerator [10] was designed to realize this. It has two rings in the tunnel that was used for TRISTAN, one for the electron beam and the other for the positron beam. The circumference of each main ring is about 3 km. The configuration of the KEKB accelerator is shown in Fig. 2.1.

The beam energies are chosen to be 8 GeV for the electron and 3.5 GeV for the positron, so that the center-of-mass energy lies on the $\Upsilon(4S)$ resonance. In this configuration, $\beta \gamma \simeq 0.425$ and the average decay length of $B$ mesons from $\Upsilon(4S)$ is about 200 $\mu$m in the laboratory frame.
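The quoted boost follows directly from the two beam energies. The sketch below assumes head-on collisions of massless beam particles (any beam crossing angle and the electron mass are neglected); the function names are hypothetical.

```python
import math

E_ELECTRON_GEV = 8.0  # electron beam energy from the text
E_POSITRON_GEV = 3.5  # positron beam energy from the text


def cm_energy(e1, e2):
    """Center-of-mass energy for head-on, massless beams: sqrt(s) = 2*sqrt(E1*E2)."""
    return 2.0 * math.sqrt(e1 * e2)


def boost_beta_gamma(e1, e2):
    """Boost of the center-of-mass system: beta*gamma = (E1 - E2) / sqrt(s)."""
    return (e1 - e2) / cm_energy(e1, e2)


sqrt_s = cm_energy(E_ELECTRON_GEV, E_POSITRON_GEV)
beta_gamma = boost_beta_gamma(E_ELECTRON_GEV, E_POSITRON_GEV)
print(f"sqrt(s) = {sqrt_s:.3f} GeV, beta*gamma = {beta_gamma:.3f}")
```

The result, about 10.58 GeV with $\beta\gamma \approx 0.425$, matches the $\Upsilon(4S)$ mass and the boost quoted above.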

In order to produce as many $B$ mesons as possible, the KEKB accelerator is designed to run at the highest luminosity in the world, $10^{34}\ \text{cm}^{-2}\text{s}^{-1}$, corresponding to about $10^{8}$ $B\overline{B}$ pairs per year. As of the end of 2002, KEKB has achieved a peak luminosity of $8.26 \times 10^{33}\ \text{cm}^{-2}\text{s}^{-1}$.
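The order of magnitude of the yearly yield can be reproduced with a one-line estimate, $N = L \sigma T$. Two inputs in the sketch below are not given in this section and are assumptions made here: a $B\overline{B}$ production cross section of about 1.1 nb at the $\Upsilon(4S)$, and an effective running time of $10^7$ s per year.

```python
DESIGN_LUMINOSITY = 1.0e34   # cm^-2 s^-1, design value from the text
SIGMA_BB_NB = 1.1            # nb, assumed BB-bar cross section at the Y(4S)
SECONDS_PER_YEAR = 1.0e7     # s, assumed effective running time per year
NB_TO_CM2 = 1.0e-33          # 1 nb = 1e-33 cm^2


def bb_pairs_per_year(luminosity, sigma_nb, seconds):
    """Number of BB-bar pairs: luminosity x cross section x running time."""
    return luminosity * sigma_nb * NB_TO_CM2 * seconds


n_pairs = bb_pairs_per_year(DESIGN_LUMINOSITY, SIGMA_BB_NB, SECONDS_PER_YEAR)
print(f"BB pairs per year: {n_pairs:.2e}")
```

With these inputs the estimate comes out at about $10^8$ pairs per year, in line with the design figure.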



Figure 2.1: Configuration of the KEKB accelerator system.

### 2.2 Belle Detector

The Belle detector (Fig. 2.2) is a $4\pi$ detector designed for the study of CP violation in the B meson system [11, 12]. It consists of the Central Drift Chamber (CDC), the Aerogel Čerenkov Counter (ACC), the Time of Flight counter (TOF), the CsI calorimeter (ECL), the $K_L$ and Muon detector (KLM), and the Silicon Vertex Detector (SVD).

As described in the previous chapter, precise vertex measurement and flavor tagging of B mesons are required for the study of CP violation. There are a number of candidate technologies for the vertex detector, e.g. scintillating fibers and gas microstrip detectors. We decided to use double-sided silicon microstrip detectors as the vertex detector in the Belle experiment. A silicon strip detector can achieve an intrinsic resolution of order 10 $\mu$m. The effect of multiple scattering, which is of great importance in the Belle momentum region (typically $p \sim 1$ GeV), is minimized by the use of double-sided silicon detectors.



Figure 2.2: Schematic view of the Belle detector.

The flavor of the B meson that decays into a CP eigenstate $(B_{CP})$ is determined from the flavor of the other B meson $(B_{tag})$. There are mainly two methods to tag the flavor of a B meson. The first is based on semileptonic B decays, in which the charge of the lepton ($e$ or $\mu$) indicates the flavor of the B. The second uses the charge of kaons, which also indicates the flavor of the B because the $b \to c \to s$ decay chain is the dominant process. Good $K/\pi$ separation is required for kaon tagging.

In Belle, electrons are identified by the CDC and the ECL. Muons are detected by the KLM. The CDC, TOF and ACC together provide $K/\pi$ separation up to 3.5 GeV/c: the CDC covers momenta up to 0.8 GeV/c, the TOF covers momenta up to 1.2 GeV/c, and the ACC covers the momentum range above 1.2 GeV/c.

Charged tracks are primarily reconstructed by the CDC. Photons are detected by the ECL.  $K_L$ 's are detected by the KLM.

#### 2.2.1 Silicon Vertex Detector (SVD)

The main task of the Silicon Vertex Detector (SVD) [13] is to reconstruct the decay vertices of the two primary B mesons in order to determine the time difference between the two decays. The SVD is designed to have an intrinsic resolution of a few tens of $\mu$m, which is much better than the resolution of a wire drift chamber. The Belle SVD is situated just inside the CDC and reconstructs precise tracks of charged particles in combination with the CDC measurement. The CDC can reconstruct low momentum tracks down to a $p_T$ of about 70 MeV/c, since the inner radius of the CDC is about 8 cm. The SVD is therefore not required to function as a stand-alone tracker for low $p_T$ tracks.

At KEKB, the decay distance of $B\overline{B}$ pairs is about 200 $\mu$m on average. The target for the SVD vertex resolution was therefore set to < 200 $\mu$m.

The tracks produced at KEKB are rather soft, and multiple Coulomb scattering is a dominant source of degradation of the vertex resolution. This imposes strict constraints on the detector design and the mechanical layout. The innermost layer and the support structure must be low mass but stiff, and the readout electronics must be placed outside of the tracking volume.

#### 2.2.1.1 Detector configuration

The SVD has three cylindrical layers consisting of units of silicon sensors. The layers are located at radii of 3.0 cm, 4.55 cm and 6.05 cm, respectively. The SVD covers $23^{\circ} < \theta < 139^{\circ}$, corresponding to an angular acceptance of 86 %. The three layers have 8, 10 and 14 sensor ladders in $\phi$. The structure of the SVD is shown in Fig. 2.3.
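The quoted acceptance can be cross-checked numerically. The text does not say in which frame the 86 % is defined; the sketch below assumes it is the solid-angle fraction in the $\Upsilon(4S)$ rest frame for massless particles, with $\theta$ measured from the boost direction (z-axis), and compares it with the plain laboratory-frame fraction. The function names are hypothetical.

```python
import math

BETA_GAMMA = 0.425  # boost of the center-of-mass system quoted in the text


def lab_fraction(theta_min_deg, theta_max_deg):
    """Fraction of the full solid angle between two lab polar angles."""
    c_min = math.cos(math.radians(theta_min_deg))
    c_max = math.cos(math.radians(theta_max_deg))
    return (c_min - c_max) / 2.0


def to_cm_cos(cos_lab, beta):
    """Aberration formula for a massless particle; the CM frame moves along +z."""
    return (cos_lab - beta) / (1.0 - beta * cos_lab)


def cm_fraction(theta_min_deg, theta_max_deg, beta_gamma=BETA_GAMMA):
    """Solid-angle fraction in the CM frame covered by a lab polar-angle range."""
    beta = beta_gamma / math.sqrt(1.0 + beta_gamma ** 2)
    c_min = to_cm_cos(math.cos(math.radians(theta_min_deg)), beta)
    c_max = to_cm_cos(math.cos(math.radians(theta_max_deg)), beta)
    return (c_min - c_max) / 2.0


lab = lab_fraction(23.0, 139.0)
cm = cm_fraction(23.0, 139.0)
print(f"lab frame: {lab:.1%}, CM frame: {cm:.1%}")
```

Under this assumption the center-of-mass fraction comes out close to 86 %, while the laboratory-frame fraction is about 84 %, which suggests (but does not prove) that the quoted figure refers to the center-of-mass frame.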

Each layer is constructed from double-sided silicon strip detectors (DSSDs) and the front-end electronics. We use the S6939 DSSD fabricated by Hamamatsu Photonics. One side of the DSSD (the n-side) has $n^+$-strips oriented perpendicular to the beam direction to measure the z coordinate, and the other side (the p-side) has longitudinal $p^+$-strips that allow the $\phi$ coordinate measurement. The z strip pitch is 42 $\mu$m and the $\phi$ strip pitch is 25 $\mu$m. A bias voltage of 75 V is supplied to the n-side, and the p-side is grounded. The p-strips and n-strips collect the electron-hole pairs that are induced by charged tracks.



Figure 2.3: Side and end views of the Belle SVD.

#### 2.2.1.2 Readout Electronics

Figure 2.4 shows a block diagram of the overall readout system. The signals from the DSSDs are read out by VA1 chips mounted on a hybrid board and transferred via thin cables to a buffer card (ABC), followed by thicker cables and a repeater board (REBO). The REBO drives a $\sim 30$ m long cable to send differential output to the receivers located in the electronics hut, where Flash ADC (FADC) cards are installed to digitize the analog signal and send the digitized signal to the central DAQ system.

**VA1 Chip** The VA1 chip [14], which was originally developed at CERN and is commercially available from IDEAS, Oslo, Norway, is used as the front-end VLSI chip. The VA1 is a 128-channel CMOS integrated circuit designed for the readout of silicon vertex detectors and other small-signal devices. It has extremely good noise performance (equivalent noise charge = $165 e^- + 6.1 e^-/\text{pF}$ with 2.5 $\mu$s shaping time) and consumes only 1.2 mW/channel. The VA1 is also radiation tolerant up to doses of order 200 kRad [15].
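The noise figure quoted above is a linear function of the capacitive load on the amplifier input. As a small illustration (the function name and the 10 pF example load are hypothetical, not values from this thesis):

```python
ENC_OFFSET_E = 165.0       # electrons, constant term quoted in the text
ENC_SLOPE_E_PER_PF = 6.1   # electrons per pF, slope quoted in the text


def enc_electrons(load_capacitance_pf):
    """Equivalent noise charge of the VA1 (2.5 us shaping) for a given load."""
    return ENC_OFFSET_E + ENC_SLOPE_E_PER_PF * load_capacitance_pf


# Example: a hypothetical strip load of 10 pF gives 226 electrons of noise.
enc_10pf = enc_electrons(10.0)
print(f"ENC at 10 pF: {enc_10pf:.0f} e-")
```

Because the slope term grows with the strip capacitance, longer or ganged strips see correspondingly more noise for the same chip.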

Figure 2.5 shows the block diagram of the VA1 chip. Signals from the strips are amplified by a charge-sensitive amplifier, followed by a CR-RC shaper and a sample-and-hold circuit. The held output voltages are read out sequentially when a trigger arrives. A single bit propagating through a shift register closes the switches that connect each channel to the output amplifier, one at a time. In this way, all 128 channels in a chip can be read out through a single ADC.



Figure 2.4: Block diagram of the SVD readout system.

Figure 2.5: Block diagram of the VA1 chip.

**CORE (COntrol and REpeater) System** The repeater system [16] consists of small cards near the detector (ABCs), boards for signal buffering and frontend control (REBOs), a board for monitoring (RAMBO), a mother board (MAMBO) and their cooling and shielding case (DOCK).

Signals from one side of a DSSD are read by five VA1 chips on a hybrid board and sent to the repeater system through special thin 30-wire cables, which were provided by Omnetix Co. ltd, USA. Two cables from two hybrid boards are merged on an ABC into a 50-wire flat cable. The flat cable is connected to a MAMBO contained in a DOCK. A MAMBO has five slots. A RAMBO occupies the center slot, and four REBOs use the other slots.

Signals from the ABCs are sent to the REBOs through a local bus. After amplification and filtering, differential signals are sent to the backend electronics system located in the electronics hut. A thermistor is mounted on each hybrid board to monitor the temperature of the frontend electronics. The voltage on each thermistor is read out by the RAMBO. All the timing and control signals, bias voltages and power supplies from the backend electronics are fed to the MAMBO and then distributed to each board or sent to the frontend electronics.

Since one DOCK can control and read out eight ABCs (or 16 hybrids), we needed 8 DOCKs (8 MAMBOs, 8 RAMBOs and 32 REBOs) for the 128 hybrids on 32 ladders.

#### 2.2.1.3 Performance

The impact parameter resolutions in the plane perpendicular to the beam axis and along the beam direction are measured to be

$$\sigma_{r\phi}^2 = (21)^2 + \left(\frac{69}{p\beta\sin^{3/2}\theta}\right)^2 \mu\mathrm{m}^2, \quad \sigma_z^2 = (39)^2 + \left(\frac{69}{p\beta\sin^{5/2}\theta}\right)^2 \mu\mathrm{m}^2$$
(2.1)

respectively, where p is the momentum measured in GeV/c and  $\beta$  is the velocity divided by c.
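
As a rough numerical illustration of Eq. (2.1), the sketch below evaluates both resolutions for a given momentum and polar angle. The function name and the default $\beta = 1$ are choices made for this example only; the constants are the ones quoted in the equation.

```python
import math

def impact_parameter_resolution(p_gev, theta_deg, beta=1.0):
    """Return (sigma_rphi, sigma_z) in micrometres according to Eq. (2.1).

    p_gev     : momentum in GeV/c
    theta_deg : polar angle in degrees
    beta      : velocity divided by c (default 1 is an assumption for fast tracks)
    """
    sin_theta = math.sin(math.radians(theta_deg))
    p_beta = p_gev * beta
    # Constant term and multiple-scattering term are added in quadrature.
    sigma_rphi = math.hypot(21.0, 69.0 / (p_beta * sin_theta ** 1.5))
    sigma_z = math.hypot(39.0, 69.0 / (p_beta * sin_theta ** 2.5))
    return sigma_rphi, sigma_z

sigma_rphi, sigma_z = impact_parameter_resolution(p_gev=1.0, theta_deg=90.0)
print(f"sigma_rphi = {sigma_rphi:.1f} um, sigma_z = {sigma_z:.1f} um")
```

The multiple-scattering term dominates at low momentum and at shallow polar angles, where the track crosses more material.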

#### 2.2.2 Central Drift Chamber (CDC)

The main role of the Central Drift Chamber (CDC) is the detection of charged particles. Specifically, the physics goals of the Belle experiment require a momentum resolution better than  $\sigma_{p_t}/p_t \sim 0.5 \cdot \sqrt{1+p_t^2}$  % ( $p_t$  in GeV/c) for all charged particles with  $p_t \geq$ 100 MeV/c. In addition, the CDC is expected to provide particle identification information in the form of precise dE/dx measurements for charged particles.

The structure of the CDC is shown in Fig. 2.6. The CDC covers  $17^{\circ} \leq \theta \leq 150^{\circ}$ , providing an angular acceptance of 92 % of  $4\pi$  in the  $\Upsilon(4S)$  rest frame. The inner and outer radii are 8 cm and 88 cm, respectively. The CDC consists of 50 sense-wire layers and 3 cathode strip layers. The sense-wire layers are grouped into 11 superlayers, of which 6 are axial and 5 are stereo superlayers. The number of readout channels is 8,400 for anode wires and 1,792 for cathode strips. The chamber is filled with a gas mixture of 50 % helium and 50 % ethane ( $C_2H_6$ ) to minimize multiple Coulomb scattering. A magnetic field of 1.5 Tesla is chosen to obtain the best momentum resolution without sacrificing efficiency for low momentum tracks.

From the results of the beam test, the overall spatial resolution is 130  $\mu$m and the transverse momentum resolution is  $(\sigma_{P_T}/P_T)^2 = (0.0019p_T)^2 + (0.0034)^2$  with  $p_T$  in GeV/c. The dE/dx measurements have a resolution for hadron tracks of  $\sigma(dE/dx) = 6.9\%$  and are useful for  $3\sigma K/\pi$  separation below 0.8 GeV/c. The CDC is also useful for  $4\sigma e/\pi$  separation.  $e/\pi$  separation below 1 GeV/c is very important for electron identification because the  $e/\pi$  separation method using the ECL is not effective in this momentum region.
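
The measured transverse momentum resolution can be compared with the design requirement quoted at the beginning of this section. The following is a minimal sketch; the function names are illustrative, and both formulas are taken directly from the text with $p_t$ in GeV/c.

```python
import math

def measured_pt_resolution(pt_gev):
    """Relative pt resolution from the beam test: (0.0019 pt) (+) 0.0034."""
    return math.hypot(0.0019 * pt_gev, 0.0034)

def required_pt_resolution(pt_gev):
    """Design goal: 0.5 * sqrt(1 + pt^2) percent, expressed as a fraction."""
    return 0.005 * math.sqrt(1.0 + pt_gev ** 2)

for pt in (0.1, 0.5, 1.0, 2.0, 5.0):
    meas = measured_pt_resolution(pt)
    goal = required_pt_resolution(pt)
    print(f"pt = {pt:4.1f} GeV/c  measured = {100 * meas:.2f} %  goal = {100 * goal:.2f} %")
```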



Figure 2.6: Structure of central drift chamber.

The CDC trigger system uses signals from axial super-layers for the r- $\phi$  trigger. The z trigger is formed from the direct z information provided by the cathode strips and z coordinates inferred from the axial and stereo super-layers.

### 2.2.3 Aerogel Čerenkov counter (ACC)

The Aerogel Čerenkov counter (ACC) system extends the momentum coverage for particle identification from  $p = 1.2 \text{ GeV}/c$ , the upper limit of the Time of Flight system, up to the kinematic limit of 2-body *B* decays such as  $B^0 \to \pi^+\pi^-$ , which is  $p = 2.5 \sim 3.5 \text{ GeV}/c$  depending on the polar angle.

The aerogel of the ACC is made of SiO<sub>2</sub>. The refractive index of the aerogel is chosen so that the pion produces Čerenkov light in the aerogel while the kaon does not. In general, a particle of velocity  $\beta$  emits Čerenkov light in a medium of refractive index n when the following threshold condition is satisfied:

$$n > 1/\beta = \sqrt{1 + (m/p)^2},$$
 (2.2)

where the particle momentum p is measured by the CDC.
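
Solving Eq. (2.2) for the momentum gives the threshold $p_{thr} = m/\sqrt{n^2 - 1}$ above which a particle of mass m radiates. The sketch below evaluates this threshold for pions and kaons; the function name is illustrative, the particle masses are standard values, and the refractive indices are those used in the ACC.

```python
import math

M_PION = 0.13957   # charged pion mass in GeV/c^2
M_KAON = 0.493677  # charged kaon mass in GeV/c^2

def cherenkov_threshold(mass_gev, n):
    """Momentum (GeV/c) above which a particle radiates in a medium of index n."""
    return mass_gev / math.sqrt(n * n - 1.0)

for n in (1.010, 1.013, 1.015, 1.020, 1.028, 1.030):
    p_pi = cherenkov_threshold(M_PION, n)
    p_k = cherenkov_threshold(M_KAON, n)
    # Between the two thresholds only the pion emits light.
    print(f"n = {n:.3f}: pion fires above {p_pi:.2f} GeV/c, kaon above {p_k:.2f} GeV/c")
```

A counter therefore separates kaons from pions in the window between the two thresholds, which moves to higher momentum as n decreases.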

Each aerogel counter module consists of a silica aerogel radiator and fine-mesh photomultiplier tubes that detect the Čerenkov radiation. A typical aerogel module comprises aerogel tiles contained in a 0.2-mm-thick aluminum box.

The ACC is divided into two parts. A barrel array (BACC) covers an angular range of  $34^{\circ} < \theta < 127^{\circ}$  and a forward end-cap array (EACC) covers an angular range of  $17^{\circ} < \theta < 34^{\circ}$ .

The BACC provides  $3\sigma K/\pi$  separation in the momentum region  $1.0 < p_K < 3.6$  GeV/c. The BACC consists of 960 aerogel counter modules. Five different indices of refraction, n = 1.01, 1.013, 1.015, 1.020 and 1.028, are used depending on the polar angle. Each barrel counter is viewed by one or two fine-mesh photomultiplier tubes (FM-PMTs).

Figure 2.7: The configuration of the ACC.

The EACC consists of 288 modules with a refractive index of 1.03. This eliminates the need for the TOF system in the end-cap region, since this refractive index gives  $3\sigma K/\pi$  separation in the momentum range from 0.7 to 2.4 GeV/c. This approach provides complete endcap flavor tagging, as well as particle identification for many of the few-body decays relevant to CP eigenstates.

### 2.2.4 Time/Trigger of Flight Counter (TOF)

The relation between the measured flight time T, and the particle momentum p measured by the CDC is as follows:

$$T = \frac{L}{c}\sqrt{1 + (m/p)^2},$$
(2.3)

where L is the flight length, which depends on the TOF geometry, and m is the particle mass. Because of the mass difference between kaons and pions, the difference in T between kaons and pions is ~ 300 ps. The TOF has a time resolution of 95 ps and provides  $3\sigma K/\pi$  separation. With this time resolution, the TOF counters can also provide fast timing signals for the trigger system.

Figure 2.8: The configuration of the TOF/TSC.
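
The ~ 300 ps figure can be reproduced from Eq. (2.3) with a short calculation. The sketch below assumes a flight length of 1.2 m (the radius at which the TOF modules are located, i.e. normal incidence) and a momentum of 1.2 GeV/c (the upper limit of the TOF system); both values are assumptions chosen for illustration, as are the function and variable names.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in m/ns
M_PION = 0.13957          # charged pion mass in GeV/c^2
M_KAON = 0.493677         # charged kaon mass in GeV/c^2

def flight_time_ns(p_gev, mass_gev, length_m):
    """Flight time from Eq. (2.3): T = (L/c) * sqrt(1 + (m/p)^2)."""
    return length_m / C_M_PER_NS * math.sqrt(1.0 + (mass_gev / p_gev) ** 2)

FLIGHT_LENGTH_M = 1.2  # assumption: straight path to r = 120 cm
MOMENTUM_GEV = 1.2     # assumption: upper end of the TOF momentum range
TIME_RESOLUTION_PS = 95.0

t_kaon = flight_time_ns(MOMENTUM_GEV, M_KAON, FLIGHT_LENGTH_M)
t_pion = flight_time_ns(MOMENTUM_GEV, M_PION, FLIGHT_LENGTH_M)
dt_ps = 1e3 * (t_kaon - t_pion)
separation_sigma = dt_ps / TIME_RESOLUTION_PS
print(f"T_K - T_pi = {dt_ps:.0f} ps, i.e. {separation_sigma:.1f} sigma")
```

At lower momenta the time difference grows, so the separation improves below this point.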

The TOF system comprises 64 barrel TOF/Trigger Scintillation Counter (TSC) modules. A TOF/TSC module consists of two trapezoid shaped 4-cm-thick counters and one 5-mm-thick TSC counter separated by a 2-cm gap as shown in Fig. 2.8. A coincidence between TSC and TOF counters rejects  $\gamma$  background and provides a clean event timing to the Belle trigger system.

The TOF is segmented into 128 sectors in  $\phi$  and read out by one FM-PMT at each end. The TSCs have 64-fold segmentation and are read out from the backward end only by a single FM-PMT. The number of readout channels is 256 for the TOF and 64 for the TSC. The modules are located at r = 120 cm. The TOF/TSC system covers an angular range of  $34^{\circ} < \theta < 121^{\circ}$ .

#### 2.2.5 Electromagnetic calorimeter (ECL)

The main purpose of the electromagnetic calorimeter (ECL) is the detection of photons from B meson decays with high efficiency and good resolution. Most of the physics goals of the Belle experiment require the reconstruction of exclusive B meson final states. For a typical B meson decay, approximately one third of the final state particles are  $\pi^{0}$ 's, thus it is important to have photon detection capabilities that match those for charged particles, especially for low energy photons. The  $\pi^{0}$  mass resolution is dominated by the photon energy resolution. Sensitivity to and resolution of low energy photons are the critical parameters for efficient  $\pi^{0}$  detection.

Electron identification in Belle relies primarily on a comparison of the charged particle track momentum with the energy it deposits in the electromagnetic calorimeter. Good energy resolution of the calorimeter results in better hadron rejection.

Figure 2.9: The configuration of the electromagnetic calorimeter.

In order to satisfy these requirements, we chose a design of the electromagnetic calorimeter based on  $CsI(T\ell)$  crystals. All  $CsI(T\ell)$  crystals are 30 cm (16.1 radiation lengths) long, and are assembled into a tower structure pointing near the interaction point. The barrel part of the ECL has 46-fold segmentation in  $\theta$  and 144-fold segmentation in  $\phi$ .

The forward (backward) endcap part of the ECL has 13-fold (10-fold) segmentation in  $\theta$ , and the  $\phi$  segmentation varies from 48 to 144 (64 to 144). The barrel part has 6,624 crystals and the forward (backward) endcap part has 1,152 (960) crystals. Each crystal is read out by two 10 mm×20 mm photodiodes, so the total number of readout channels is 17,472. The inner radius of the barrel part is 125 cm. The forward (backward) endcap part starts at z = +196 cm (-102 cm).

The raw signals from each CsI counter are combined through the readout electronics to form an analog sum for the Level-1 trigger.

#### 2.2.6 Solenoid magnet

The magnetic field causes charged particles to follow a helical path, whose curvature is related to the momentum of the particle. The coil consists of a single layer of aluminum-stabilized superconductor, a niobium-titanium-copper alloy embedded in a high purity aluminum stabilizer. It is wound around the inner surface of an aluminum support cylinder. Indirect cooling is provided by liquid helium circulating through a single tube welded on the outer surface of the support structure. The superconducting solenoid magnet provides a magnetic field strength of 1.5 Tesla in a cylindrical volume of 3.4 m in diameter and 4.4 m in length. The field value in the CDC volume is expected to vary by 2.0 %.

Figure 2.10: The barrel and endcap parts of the Belle KLM.

### 2.2.7 $K_L$ and Muon Detector (KLM)

The KLM detects  $K_L$  mesons and muons and measures their positions. The detection of  $K_L$  is needed to reconstruct  $B \to J/\psi K_L$ . Muons are used in the *CP* violation measurements to identify the flavor of *B* mesons and to reconstruct  $J/\psi \to \mu^+\mu^-$ .

The KLM consists of an octagonal barrel and two endcaps, each of which is a sandwich structure of 14 iron plates 4.4 cm thick and 14 (15 for the barrel part) layers of 4.7 cm thick RPCs (Resistive Plate Counters). The KLM system covers an angular range of  $25^{\circ} < \theta < 145^{\circ}$ . A  $K_L$  initiates a hadron shower in the iron plates, whereas a muon passes through without showering.

Raw output signals from each superlayer are sent to readout boards, located at the Belle detector, to discriminate noise signals and then barrel z and end-cap  $\theta$  readouts are used for the triggering purpose.



Figure 2.11: Belle trigger system.

#### 2.2.8 Trigger System

The total cross sections and trigger rates at a luminosity of  $10^{34}$  cm<sup>-2</sup>s<sup>-1</sup> for various physics processes of interest are listed in Table 2.1. We need to accumulate samples of Bhabha and  $\gamma\gamma$  events to measure the luminosity and to calibrate the detector responses. However, since their rates are very large, these triggers must be prescaled by a factor ~ 100. Because of their distinct signatures, this should not be difficult. Although the cross sections for physics events of interest are rather small, these events can be triggered by appropriately restrictive conditions.
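
The rates in Table 2.1 follow from multiplying each cross section by the luminosity and dividing by the prescale factor, using 1 nb = $10^{-33}$ cm<sup>2</sup>. A minimal sketch of that arithmetic is shown below; the function and dictionary names are illustrative, and the two-photon entry is omitted because its rate in the table refers to a tighter $p_t$ condition than its cross section.

```python
NB_TO_CM2 = 1e-33  # 1 nanobarn expressed in cm^2

def trigger_rate_hz(cross_section_nb, luminosity=1e34, prescale=1):
    """Event rate in Hz for a cross section in nb at the given luminosity."""
    return cross_section_nb * NB_TO_CM2 * luminosity / prescale

# (cross section in nb, prescale factor) taken from Table 2.1
processes = {
    "Upsilon(4S) -> BBbar": (1.2, 1),
    "continuum hadrons": (2.8, 1),
    "mu pairs + tau pairs": (1.6, 1),
    "Bhabha": (44.0, 100),
    "gamma gamma": (2.4, 100),
}

rates = {name: trigger_rate_hz(xs, prescale=ps) for name, (xs, ps) in processes.items()}
for name, rate in rates.items():
    print(f"{name:22s} {rate:6.2f} Hz")
```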

The Belle trigger system consists of the Level-1 hardware trigger and the Level-3 software trigger. Fig. 2.11 shows the schematic view of the Belle Level-1 trigger system. It consists of the sub-detector trigger systems and the central trigger system called the Global Decision Logic (GDL) [17]. The sub-detector trigger systems are based on two categories: track triggers and energy triggers. The CDC and TOF are used to yield trigger signals for charged particles. The ECL trigger system provides triggers based on total energy deposit and cluster counting of crystal hits. These two categories allow sufficient redundancy. The KLM trigger gives additional information on muons and the EFC triggers are used for tagging two-photon events as well as Bhabha events. The sub-detectors process event signals in parallel and provide trigger information to the GDL, where all information is combined to characterize an event type. Information from the SVD has not been implemented in the present trigger arrangement.

Considering the ultimate beam crossing rate of 509 MHz ( $\sim 2$  ns interval) with the full bucket operation of KEKB, a "fast trigger and gate" scheme is adopted for the Belle trigger and data acquisition system. The trigger system provides the trigger signal at a fixed time of 2.2  $\mu$s after the event occurrence. The trigger signal is used as the gate signal of the ECL readout and the stop signal of the TDC for the CDC. Therefore, it is important to have good timing accuracy. The timing of the trigger is primarily determined by the TOF trigger, which has a time jitter of less than 10 ns. The ECL trigger signals are also used as timing signals for events in which the TOF trigger is not available. In order to maintain the 2.2  $\mu$s latency, each sub-detector trigger signal is required to be available at the GDL input within a maximum latency of 1.85  $\mu$s. Timing adjustments are done at the input of the GDL. As a result, the GDL is left with a fixed 350 ns processing time to form the final trigger signal. In the case of the SVD readout, the TOF trigger also provides the fast Level-0 trigger signal with a latency of  $\simeq 0.85 \ \mu s$ . The Belle trigger system, including most of the sub-detector trigger systems, is operated in a pipelined manner with clocks synchronized to the KEKB accelerator RF signal. The base system clock is 16 MHz, which is obtained by dividing the 509 MHz RF by 32. Higher frequency clocks, 32 MHz and 64 MHz, are also available for systems requiring fast processing.
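
The timing figures quoted above are related by simple arithmetic, sketched below. The variable names are illustrative; all input values are the ones given in the text.

```python
RF_MHZ = 509.0            # KEKB accelerator RF frequency
CLOCK_DIVIDER = 32        # RF is divided by 32 for the base system clock
L1_LATENCY_NS = 2200      # fixed Level-1 trigger latency (2.2 us)
SUBDET_MAX_NS = 1850      # latest arrival of sub-detector signals at the GDL (1.85 us)

bunch_spacing_ns = 1e3 / RF_MHZ                  # interval between beam crossings
system_clock_mhz = RF_MHZ / CLOCK_DIVIDER        # nominally "16 MHz"
gdl_budget_ns = L1_LATENCY_NS - SUBDET_MAX_NS    # time left for the GDL decision

print(f"bunch spacing  : {bunch_spacing_ns:.2f} ns")
print(f"system clock   : {system_clock_mhz:.2f} MHz")
print(f"GDL time budget: {gdl_budget_ns} ns")
```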

The Belle trigger system extensively utilizes programmable logic chips, Xilinx Field Programmable Gate Array (FPGA) and Complex Programmable Logic Device (CPLD) chips, which provide the large flexibility of the trigger logic and reduce the number of types of hardware modules.

#### 2.2.9 Data Acquisition System

The global scheme of the system is shown in Fig. 2.12. All detectors except for the SVD are read out by a TDC-based unified readout system. Analog signals from the CDC, ACC, TOF, EFC, and ECL are digitized by Q-to-T conversion; the charge of the detector signal is converted to a pulse whose time width is proportional to the pulse charge. We can thus measure both the timing and the pulse height with only one TDC channel.

The entire system is segmented into 7 subsystems running in parallel, each handling the data from a sub-detector. At each detector subsystem, the analog signals are converted to time information, for example by a Q-to-T converter (QTC). The final trigger asserted by the Global Decision Logic (GDL) is distributed to each detector subsystem by the sequence controller. Data from each subsystem are combined into a single event record by an event builder, which converts "detector-by-detector" parallel data streams to an "event-by-event" data river. The event builder output is transferred to an online computer farm, where another level of event filtering is done after fast event reconstruction (called the "Level-3 trigger"). The data are then sent to a mass storage system located at the computer center via optical fibers.

| Physics process                                                          | Cross section (nb) | Rate (Hz)       |
|--------------------------------------------------------------------------|--------------------|-----------------|
| $\Upsilon(4S) \to B\overline{B}$                                         | 1.2                | 12              |
| Hadron production from continuum                                         | 2.8                | 28              |
| $\mu^+\mu^-+\tau^+\tau^-$                                                | 1.6                | 16              |
| Bhabha $(\theta_{lab} \ge 17^{\circ})$                                   | 44                 | $4.4^{(a)}$     |
| $\gamma\gamma~(\theta_{lab} \ge 17^{\circ})$                             | 2.4                | $0.24^{(a)}$    |
| $2\gamma$ processes ( $\theta_{lab} \ge 17^{\circ}$ ; $p_t \ge 0.2$ GeV) | $\sim 15$          | $\sim 35^{(b)}$ |
| Total                                                                    | $\sim 67$          | $\sim 96$       |

Table 2.1: Total cross section and trigger rates with  $L = 10^{34} \text{cm}^{-2} \text{s}^{-1}$  from various physics processes at  $\Upsilon(4S)$ . Superscript (a) indicates the values pre-scaled by a factor 1/100 and superscript (b) indicates the restricted condition of  $p_t \ge 0.3 \text{ GeV}/c$ .

A typical data size of a hadronic event from  $B\overline{B}$  or  $q\overline{q}$  production is measured to be about 30 kB, which corresponds to a maximum data transfer rate of 15 MB/s (that is, a trigger rate of 500 Hz).



Figure 2.12: Schematic view of the Belle DAQ system.

## Chapter 3

## Upgrade of the SVD

Although we achieved the first discovery of CP violation in the neutral B meson system in 2001 using our silicon vertex detector, we are upgrading it to obtain still better performance. The detector design is altered (1) to improve the impact parameter resolution by increasing the number of detection layers and minimizing the radius of the innermost layer, (2) to increase the vertex detection efficiency by increasing the angular acceptance from  $23^{\circ} < \theta < 139^{\circ}$  to  $17^{\circ} < \theta < 150^{\circ}$ , and (3) to extend the detector lifetime against the background radiation by using radiation-hard readout chips [18].

This chapter gives a brief description of the SVD2 configuration and its readout system.

## 3.1 SVD2 Detector

Figures 3.1 and 3.2 show the r-z and r- $\phi$  views of the SVD2, respectively, around the interaction point (IP).

The SVD consists of double-sided strip detectors (DSSDs), in which an electric charge is induced by a particle hit. Four layers are formed concentrically around the beam pipe by DSSD arrays as illustrated in Fig. 3.2, where each array consists of one to six DSSDs depending on the layer. Both ends of each DSSD array are connected to the frontend readout and trigger chips, VA1TAs, via flexible printed circuit boards.

In the following subsections, we describe the mechanical configuration and the components of the SVD2.



Figure 3.1: The r-z view of the SVD2.



Figure 3.2: The  $r-\phi$  view of the SVD2.

### 3.1.1 Mechanical Structure of the SVD2

The mechanical structure of the SVD2 consists of four layers of DSSD ladders, end rings, support cylinders, and an outer cover. Table 3.1 summarizes the geometry of each layer. The innermost layer has a radius of r = 2.0 cm, while the outermost layer sits at r = 8.8 cm. The increased number of detection layers and the reduced radius of the innermost layer improve the impact parameter resolution by 30%. The angular coverage of the SVD2 is increased from  $23^{\circ} < \theta < 139^{\circ}$  for the SVD1 to  $17^{\circ} < \theta < 150^{\circ}$  to improve the vertex detection efficiency. The increased acceptance has another advantage for particle tracking by the CDC, because the support material of the SVD ladders is moved out of the CDC tracking volume. The forward and backward support cylinders are connected by the outer cover, and they form the basic mechanical structure of the SVD2. The DSSD ladders are mounted on the forward and backward end rings, which are supported by the forward and the backward support cylinders. The support cylinders are in turn supported by the CDC end plates. The precision of the machining and assembly is better than 50  $\mu$m, and the position of the DSSDs is measured with a precision of 10  $\mu$m. The beam pipe is also supported by the CDC end plates. The beam pipe support is designed such that the heat load and any vibrations originating from the pipe and its cooling system should not affect the performance of the SVD system.

Figure 3.3: The schematic drawings of the ladder. Before (upper) and after (lower) assembly.

Table 3.1: Inner radius, number of DSSDs in  $\phi$ , and number of DSSDs in z for each layer.

| Layer | r (mm) | $N_{\phi}$ | $N_z$ |
|-------|--------|------------|-------|
| 1     | 20.0   | 6          | 2     |
| 2     | 43.5   | 12         | 3     |
| 3     | 70.0   | 18         | 5     |
| 4     | 88.0   | 18         | 6     |
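
The totals implied by Table 3.1 can be tallied as in the sketch below. The ladder and DSSD counts follow directly from the table. The channel count additionally assumes four hybrids per ladder (one per sensor side at each end of the ladder), which is an assumption of this example rather than a number stated in the table; with four 128-channel VA1TA chips per hybrid it reproduces the 110,592 channels quoted in Section 3.2.1.

```python
# layer -> (number of ladders in phi, number of DSSDs per ladder in z), from Table 3.1
LAYERS = {1: (6, 2), 2: (12, 3), 3: (18, 5), 4: (18, 6)}

HYBRIDS_PER_LADDER = 4   # assumption: 2 ladder ends x 2 sensor sides
CHIPS_PER_HYBRID = 4     # from Section 3.2.1
CHANNELS_PER_CHIP = 128  # from Section 3.2.1

n_ladders = sum(n_phi for n_phi, _ in LAYERS.values())
n_dssds = sum(n_phi * n_z for n_phi, n_z in LAYERS.values())
n_chips = n_ladders * HYBRIDS_PER_LADDER * CHIPS_PER_HYBRID
n_channels = n_chips * CHANNELS_PER_CHIP

print(f"ladders: {n_ladders}, DSSDs: {n_dssds}, chips: {n_chips}, channels: {n_channels}")
```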

#### 3.1.2 Double-Sided Strip Detector

Strip detectors are in principle large-area diodes divided into narrow strips, each of which is read out by a separate electronic circuit. The detector consists of a highly doped  $p^+$  region on a lightly doped  $n^-$  substrate, with a highly doped  $n^+$  layer on the backside. Usually, a reverse bias is applied to fully deplete the substrate, which enlarges the sensitive volume and increases the amount of collected charge.

Charged particles passing through the detector ionize atoms in the depletion region and produce electron-hole pairs. The generated electrons and holes are separated by the strong electric field, and the electrons (holes) drift towards the  $n^+$  ( $p^+$ ) electrode. The position of the charged particle is given by the location of the strip carrying the signal. The signal from the detector is read out as the amount of charge collected on the electrode. It is converted into a voltage by a charge amplifier and sent to the ADC.

The nominal thickness of a silicon strip detector is 300  $\mu$ m and the required energy to produce an electron-hole pair is 3.6 eV in the silicon. A minimum ionizing particle deposits about 80 keV of its energy into a 300  $\mu$ m thick silicon detector and creates about 22000 electron-hole pairs.
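
The number of electron-hole pairs quoted above is simply the deposited energy divided by the energy needed per pair. A short check is sketched below; the variable names are illustrative, and the conversion to femtocoulombs uses the elementary charge.

```python
DEPOSITED_ENERGY_EV = 80e3   # energy loss of a minimum ionizing particle in 300 um of Si
ENERGY_PER_PAIR_EV = 3.6     # energy to create one electron-hole pair in silicon
ELEMENTARY_CHARGE_C = 1.602176634e-19

n_pairs = DEPOSITED_ENERGY_EV / ENERGY_PER_PAIR_EV
signal_charge_fc = n_pairs * ELEMENTARY_CHARGE_C * 1e15  # coulomb -> femtocoulomb

print(f"{n_pairs:.0f} electron-hole pairs, {signal_charge_fc:.2f} fC per carrier type")
```

A signal of only a few femtocoulombs is the reason a low-noise charge amplifier is needed on every strip.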

The single-sided strip detectors described above make use of only one type of charge carrier, usually holes. By dividing the backside  $n^+$  layer into strips and using the electrons collected there, a second coordinate can be read out from the same wafer (Fig. 3.4). This is the principle of the double-sided silicon strip detector (DSSD). Since two-dimensional information can be obtained from one silicon wafer instead of two, the DSSD reduces the amount of material and hence the effects of multiple scattering.

The DSSDs for the inner three layers consist of 512  $\phi$ -strips with 50  $\mu$m pitch and 1024 z-strips with 75  $\mu$m pitch. The DSSDs for the outermost layer consist of 512  $\phi$ -strips with 65  $\mu$m pitch and 1024 z-strips with 73  $\mu$m pitch.

Figure 3.4: Schematic drawing of the double-sided silicon strip detector (DSSD).

### 3.2 SVD2 Data Acquisition System

Figure 3.5 shows a block diagram of the overall readout system. The system is physically distributed over three sites:

1. front-end readout chips, VA1TAs, mounted on hybrids located in close proximity to the DSSDs;
2. a repeater system located just outside the final quadrupoles about 2 m from the IP;
3. flash analog-to-digital converters (FADCs), a PC farm for data reduction, and an event builder converting subdetector-by-subdetector data records into an event-by-event data stream, all located in the electronics hut, which is an approximately 35 m cable run from the detector.



Figure 3.5: Schematic drawing of the SVD readout system.

### 3.2.1 Front-end Readout Chip

Signals from each side of DSSDs are read out by electronics comprising VA1TA front-end integrated circuits mounted on ceramic hybrids. Each hybrid holds four 128-channel VA1TA chips.

The VA1TA augments the VA1, which has been used for the SVD1 readout, by providing a prompt digital output for triggering and by improving the tolerance to radiation damage. Within 75 ns after the event occurrence, the VA1TA generates a prompt trigger output. All trigger outputs are combined in a trigger logic for the SVD2 (L0T). At the L0T, simple tracking is performed using a memory lookup technique. The trigger decision by the L0T is fed back to the VA1TA for self triggering. Radiation tolerance is one of the most important upgrades in the SVD2. Although the VA1 remains functional up to 200 kRad, it exhibits a significant increase in noise at lower levels. Moreover, the 200 kRad level affords almost no margin for error in accelerator and detector operations. The inner layers of the first SVD1 detector were severely damaged by synchrotron light during the summer of 1999 and had to be replaced. The VA1TA is made harder against radiation damage by reducing the thickness of the gate oxide. We can expect a radiation hardness of more than 20 MRad for the VA1TA with the AMS 0.35  $\mu$m process. The total number of VA1TA channels is 110,592.

#### 3.2.2 Repeater System

The repeater system provides full control of the SVD front-end electronics and takes care of transmitting the analog signals from the multiplexers on the VA1TA chips on the hybrids to the FADCs in the electronics hut. The repeater system also distributes the detector bias and hybrid power-supply voltages.

The system comprises two board designs: a motherboard (MAMBO) and a repeater board (REBO). Each of the ten repeater-chassis made of copper includes six identical REBOs plugged into a single MAMBO. Three of the REBOs in each repeater-chassis are used for positive-bias detectors and the other three are used for negative-bias detectors. The MAMBOs handle the distribution of positive and negative detector bias voltages. The MAMBOs also receive and distribute control signals from the off-detector electronics.

Overall steering of the fast trigger and control signals is done using trigger-timing modules (TTMs) housed in the electronics hut, while slow control steering is done using an RS485 VME module, also located in the electronics hut.

#### 3.2.3 Flash Analog-to-Digital Converter

The flash analog-to-digital converter (FADC) is the module that digitizes the analog signals of the SVD [19]. It receives analog signals from the VA1TAs via the REBOs. After digitization, the data are passed to a FIFO inside the FADC, and then sent to the PCs through LVDS cables. In this subsection, we explain the detailed mechanism of the FADC and the data format of the FADC output.

#### 3.2.3.1 FADC Function Explanation

A block diagram of the FADC is shown in Figure 3.6. The FADC has six RJ45 connectors, each of which is connected to an individual REBO. The fast-OR pulse corresponding to the L0 trigger and the analog VA1TA output arrive at the RJ45 connectors of the FADC via four pairs of differential cables. Each REBO outputs serialized data from four VA1TAs, and one FADC receives the output from 24 VA1TA chips in total. Signals from one RJ45 connector are digitized in two DB\_2xADc daughter boards of the FADC, each of which has an ADC chip with 10-bit resolution. In the ADC chip, signal digitization is performed according to a 20 MHz clock fed by the FADC, but only every fourth value is kept and the others are discarded. Thus the effective ADC output rate is 5 MHz. The signals are then sent to a four-event FIFO (DAP unit). The FADC has six four-event FIFOs in parallel, and the outputs from the FIFOs are merged in a "final memory". The final memory is a two-event FIFO (consequently, the FADC has a six-event capacity in its data stream), which transfers data to the PPC module.



Figure 3.6: The block diagram of the FADC board.

#### 3.2.3.2 Data Format of FADC Output

Figure 3.7 shows the data format of the FADC output of a single word (4 bytes).

| Bits    | Tag     | Content                                        |
|---------|---------|------------------------------------------------|
| 31 - 26 | 1, 2, 3 | Parity bit; reserved (set to 0); event number  |
| 25 - 16 | 4       | FADC counts (former half of VA1TA chip)        |
| 15 - 14 | 5, 6    | Stop flag; start flag                          |
| 13 - 10 | 7       | Chip number                                    |
| 9 - 0   | 8       | FADC counts (later half of VA1TA chip)         |

Figure 3.7: The data format of FADC output. The first column gives the bit numbers (from 0 to 31) and the second column gives the content tags.

The upper half word (bit#16 to bit#31) includes the 10-bit ADC count from the VA1TA chip with chip-id #12 to #23, and the lower half word (bit#0 to #15) includes the one from the VA1TA chip with chip-id #0 to #11, where the chip-id ranges from #0 to #23. Of the remaining 12 bits, one is reserved for future use and the others are used for further data quality checks by the CPU: a parity bit (#31), an event number counter (#26..#30), a stop bit (#15), a start bit (#14), and the VA1TA chip-id (#10..#13). The event number is the count of events since the beginning of the run. The parity bit is the error check flag of one record. The stop bit is set for the last record of each chip's data, and the start bit is set for the first record of each chip's data. The chip number is the id of the lower-numbered VA1TA of the pair (for example #0, not #12, for the first pair).
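
A word in this format could be unpacked with shifts and masks as in the following sketch. The function name and field names are illustrative and not part of the FADC documentation. The bit positions follow the description in the text; since the figure also lists one reserved bit among the upper six bits, the exact width of the event number field (taken here as bits 26 to 30) should be regarded as an assumption.

```python
def decode_fadc_word(word):
    """Split one 32-bit FADC word into its fields (hypothetical helper).

    Bit positions follow the text: parity #31, event number #26..#30
    (assumed width, see lead-in), upper ADC count #16..#25, stop #15,
    start #14, chip-id #10..#13, lower ADC count #0..#9.
    """
    chip_id = (word >> 10) & 0xF
    return {
        "parity": (word >> 31) & 0x1,
        "event_number": (word >> 26) & 0x1F,
        "adc_upper": (word >> 16) & 0x3FF,   # chip with id chip_id + 12
        "stop": (word >> 15) & 0x1,
        "start": (word >> 14) & 0x1,
        "chip_id": chip_id,                  # lower-numbered chip of the pair
        "upper_chip_id": chip_id + 12,
        "adc_lower": word & 0x3FF,           # chip with id chip_id
    }

# Example word: event 3, start flag set, chip pair (#5, #17), ADC counts 341 and 682.
example = (3 << 26) | (341 << 16) | (1 << 14) | (5 << 10) | 682
print(decode_fadc_word(example))
```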

The read action must be repeated 128 times to read all the data from one VA1TA chip, because one VA1TA chip carries the data of 128 strips. From the 129th read, the ADC counts for the VA1TAs with chip-id #13 and #1 are read. Since the FADC outputs the data for 24 VA1TA chips, the total data size of one event is  $(4 \times 128 \times 24)/2 = 6144$  bytes. Figure 3.8 shows the data format of a single event.
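
The event size and the mapping from a read index to the chip pair it carries can be written down as in the sketch below. The names are illustrative, and the assumption that strips within a chip arrive in sequential order is made for this example only.

```python
BYTES_PER_WORD = 4
STRIPS_PER_CHIP = 128
CHIPS_PER_FADC = 24
CHIPS_PER_WORD = 2  # each word carries one ADC count from each half of the chip range

words_per_event = STRIPS_PER_CHIP * CHIPS_PER_FADC // CHIPS_PER_WORD
bytes_per_event = words_per_event * BYTES_PER_WORD

def locate_read(read_index):
    """Map a 0-based read index to (lower chip-id, upper chip-id, strip number).

    Assumes strips are delivered sequentially within each chip.
    """
    lower_chip = read_index // STRIPS_PER_CHIP
    strip = read_index % STRIPS_PER_CHIP
    return lower_chip, lower_chip + CHIPS_PER_FADC // 2, strip

print(words_per_event, bytes_per_event)  # 1536 words, 6144 bytes
print(locate_read(128))                  # the 129th read: chips #1 and #13, strip 0
```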

#### 3.2.4 PC Farm

The data size of the SVD must be reduced because of the limited storage capacity and the limited bandwidth from the experimental hall to the storage system located at the computer center. We perform the data reduction using commercial PCs.



Figure 3.8: Data format of the FADC output of a single event.

The detector signals digitized by the FADCs are transferred to the PC farm, which consists of twelve PCs, in the electronics hut. We use a PCI board as the interface to the FADC. The SVD data reduced by the PCs are sent via TCP/IP to an event builder, at which the data from all detector components are gathered.

The details of the PCI interface board and the data processing in the PC farm are described in Chapters 4 and 5, respectively.
## Chapter 4

## **PCI** Board

We use a PCI board as the interface through which data are received from the FADC. The PCI board is provided by Fird Corporation [20]. We call this PCI board the PPCI [21] hereafter. The PPCI must be able to keep up with the high rate at which the FADC sends data. We give the requirements for the PPCI in Section 4.1. The hardware design of the PPCI is given in Section 4.2. The transfer protocol between the FADC and the PPCI is described in Section 4.3. We measure the data receiving speed of the PPCI, as described in Section 4.4. We also perform an error rate test between PPCIs to confirm the reliability of the PPCI data transfer in Section 4.5. Finally, we conclude on the qualification of the PPCI for use in the SVD2 in Section 4.6.

## 4.1 Requirements for PPCI

One PPCI board is connected to one FADC. The data size of a single event from the FADC is constant (6 kbytes), as shown in Section 3.2.3.2. Based on the background study, the trigger rate at  $L = 10^{34}\,\text{cm}^{-2}\,\text{s}^{-1}$  is estimated to be ~630 Hz (310 Hz) from the positron (electron) beam. Thus our target for the readout is to handle a total trigger rate of 1 kHz; a single PPCI board must handle 6 Mbytes of data per second.

Here we summarize the requirements imposed on the PPCI:

- $\star$  PPCI must keep up with a 1 kHz trigger rate.
- $\star$  PPCI must handle 6 Mbytes of data per second.

### 4.2 Hardware Description of PPCI

The PPCI is manufactured by Fird Corporation. A picture of this board is shown in Figure 4.1.



Figure 4.1: The picture of the PPCI-LV board.

The external connector of the PPCI is a 96-contact half-pitch connector, through which the FADC sends data at the LVDS level. The PPCI was originally designed to be used at the TTL level. Because the distance between the FADC and the PPCI has to be more than 5 meters, we use the LVDS level, which is robust against noise. According to Fird Corporation, the data transfer using this board with LVDS is guaranteed up to 10 meters in length. Data are transferred 32 bits in parallel, and the sampling timing is provided by a 20 MHz external clock generated by the FADC. The PPCI is equipped with an 8 kbyte FIFO. When the CPU is busy, the PPCI requests the FADC to stop the data transfer. After that, the FADC can still send some more data because of the delay of the signal transmission. Those data are kept in the FIFO and are not lost even in such a case.
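The numbers quoted in Sections 4.1 and 4.2 can be put side by side in a short calculation. The sketch below is only an illustration; it assumes that one 32-bit word is transferred on every clock cycle while data are flowing and that "8 kbyte" means 8192 bytes.

```python
CLOCK_HZ = 20_000_000      # XCLOCK frequency
BUS_BYTES = 4              # 32 data lines
FIFO_BYTES = 8 * 1024      # assumed meaning of "8 kbyte"
EVENT_BYTES = 6144         # FADC output per event
TRIGGER_HZ = 1000          # target trigger rate

# Peak rate of the link if a word is latched on every clock edge.
link_bytes_per_s = CLOCK_HZ * BUS_BYTES

# Average rate required by the trigger rate.
required_bytes_per_s = EVENT_BYTES * TRIGGER_HZ

# Time to move one event over the link, and time to fill an empty FIFO.
event_transfer_s = EVENT_BYTES / link_bytes_per_s
fifo_fill_s = FIFO_BYTES / link_bytes_per_s

print(f"link peak      : {link_bytes_per_s / 1e6:.1f} MB/s")
print(f"required       : {required_bytes_per_s / 1e6:.3f} MB/s")
print(f"event transfer : {event_transfer_s * 1e6:.1f} us")
print(f"FIFO fill time : {fifo_fill_s * 1e6:.1f} us")
```

Under these assumptions the FIFO is larger than one event, so a complete event can sit in the FIFO while the CPU is busy.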

### 4.3 Transfer Protocol between FADC and PPCI

Four control lines, 32 data lines, and one clock line are used in the data transfer between the FADC and the PPCI. The four control lines carry the protocol between the PPCI and the FADC, and the 32 data lines transfer the digitized data from the FADC. The protocol signals on the four control lines are called XREQUEST, XVALID, XREADY, and XENABLE. The clock signal is called XCLOCK and the data lines are called XDATA. The first two control signals are sent from the FADC to the PPCI, and the last two are sent from the PPCI to the FADC. XCLOCK is sent from the FADC to the PPCI. The data line signals are latched at the rising edge of XCLOCK. An overview of these signals is shown in Figure 4.2.



Figure 4.2: The signals between the FADC and the PPCI. The clock signal is XCLOCK and the data lines are XDATA. XREQUEST and XVALID are sent from the FADC to the PPCI, and XREADY and XENABLE are sent from the PPCI to the FADC.

XCLOCK is the 20 MHz clock signal generated by the FADC. It is the same clock used for the AD conversion in the FADC. XREQUEST requests a read by the PPCI when data exist in the FADC final memory. XREADY and XENABLE indicate whether the PC can receive data. XVALID is asserted when the data on XDATA are valid. XDATA are latched by XCLOCK.

The timing chart at the beginning of the data transfer from the FADC to the PPCI is shown in Figure 4.3. As soon as data arrive in the final memory of the FADC, XREQUEST is asserted. If the PC is ready to receive the data, both XENABLE and XREADY are asserted. Then XVALID is asserted, and the data transfer begins. XCLOCK is sent to the PPCI continuously. While XVALID is asserted, data are sent through the data lines.

The timing chart at the termination of the data transfer by the FADC is shown in Figure 4.4. When the final memory in the FADC becomes empty, XVALID is



Figure 4.3: The timing chart of the beginning of the data transfer between the FADC and the PPCI.



Figure 4.4: The timing chart at the termination of the data transfer by the FADC between the FADC and the PPCI.

negated to indicate the idle state, and the data transfer is completed. Following that, XREADY and XENABLE are negated.

Not only the FADC but also the PPCI can stop the data transfer. The timing chart at the termination of the data transfer by the PPCI is shown in Figure 4.5. The PPCI negates XREADY and XENABLE to request the FADC to stop the data output. The FADC still sends some more data until it accepts the request. After that, the FADC negates XVALID and finishes the data transfer. The data sent in the meantime are stored in the FIFO in the PPCI.
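The role of the FIFO in this handshake can be illustrated with a toy cycle-by-cycle model. This is a simplified sketch and not a description of the real hardware: XREADY and XENABLE are merged into one flag, the FADC is assumed to see that flag `latency` clock cycles late, and the PC is assumed to drain one word per cycle when it is not busy. All names and numbers are hypothetical.

```python
from collections import deque


def simulate_transfer(words, busy, latency=3, fifo_depth=2048):
    """Toy model of the FADC -> PPCI transfer.

    words : data words waiting in the FADC final memory
    busy  : set of clock cycles in which the PC cannot read the FIFO
    Returns (words read by the PC, maximum FIFO occupancy).
    """
    pending = deque(words)   # FADC final memory
    fifo = deque()           # FIFO on the PPCI
    received = []            # words read out by the PC
    ready_history = []       # XREADY/XENABLE level driven by the PPCI
    max_fill = 0
    cycle = 0
    while pending or fifo:
        pc_busy = cycle in busy
        ready_history.append(not pc_busy)
        # The FADC reacts to the handshake level of `latency` cycles ago.
        seen = cycle - latency
        ready_seen = ready_history[seen] if seen >= 0 else False
        xrequest = bool(pending)           # data exist in the final memory
        xvalid = xrequest and ready_seen   # FADC drives valid data
        if xvalid:
            if len(fifo) >= fifo_depth:
                raise OverflowError("FIFO overflow, data would be lost")
            fifo.append(pending.popleft())
        max_fill = max(max_fill, len(fifo))
        if not pc_busy and fifo:
            received.append(fifo.popleft())
        cycle += 1
    return received, max_fill
```

For example, `simulate_transfer(list(range(100)), set(range(10, 20)))` delivers all 100 words in order, and the FIFO holds at most the few words that were sent before the FADC noticed the stop request.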



Figure 4.5: The timing chart at the termination of the data transfer between the FADC and the PPCI when the PPCI is busy.

## 4.4 Measurement of Data Receiving Speed with PPCI

The required speed for the PPCI board is estimated to be at least 6 MBytes/sec per board. To ease the maintenance and to reduce the cost and space, we need to keep the number of PCs small and install as many PPCIs per PC as possible. In this section, we study the relation between the number of PPCIs installed in one PC and the data receiving speed per PPCI board.

We measure the receiving speed of the PPCI while varying the number of installed PPCIs from one to four, where one PC has four PCI slots for PPCIs.

Figure 4.6 shows the test configuration. We use another PPCI as the data sender instead of FADC. Only one PPCI is installed in a sender PC to reduce the CPU and bus load of the PC. Received data are not checked or processed so as to make no



Figure 4.6: The schematic drawing of the test bench for the transfer speed test. A PPCI, instead of the FADC, sends data, and the other PPCI receives the data. Only one PPCI is installed in a sender PC to reduce the CPU and bus load of the PC. The sender and the receiver PPCIs are connected with a 5-meter LVDS cable.

load except for the data receiving program in the receiver PCs. The sender and the receiver PPCIs are connected with a 5-meter LVDS cable.

The test result is shown in Figure 4.7. The data receiving speed of a PPCI without any PPCI in the neighboring slots is 39.6 MB/s. When two PPCIs are installed, the total receiving speed is almost doubled. However, when more than two are installed, the total speed no longer scales with the number of PPCIs, and the receiving speed per PPCI board decreases as the number of installed PPCIs increases. Nevertheless, even if four PPCIs are installed in one PC, the measured receiving speed per board is 20.6 MB/s, which satisfies the required speed of 6 MB/s indicated by the dashed line in the figure.

## 4.5 Test for Error Rate

In this section, we carry out two checks. (1) To test the stability of the PPCI hardware, we measure the error rate of the data transfer through the LVDS cable between PPCIs by comparing the received-data image with the sent-data image. (2) To detect a possible fault in the data transfer protocol between the FADC and the PPCI, we check the boundary of each read action by the PPCI, where such a fault can cause missing and/or extra data.

First we explain the error rate test. This test is performed separately from that of



Figure 4.7: The data receiving speed as a function of the number of PPCI boards. The solid line and the dashed line show the speed per PPCI and the total speed, respectively. The study is performed without any data processing in the receiver PC to avoid possible latency. Even when four PPCIs are installed in the PC, the PPCI still clears the requirement of 6 MB/s, which is indicated by the horizontal line.

the data transfer speed, because the speed can be affected by the processing load of checking the data consistency.



Figure 4.8: The schematic drawing of the test bench for the error test of the PPCI. Three PPCIs are installed in one PC, where each PPCI is connected to another PPCI installed in another PC with a 5-meter LVDS cable.

The test setup is shown in Figure 4.8. Three PPCIs are installed in one PC. Each PPCI is connected to another PPCI installed in another PC with a 5-meter LVDS cable. We generate dummy events in the sender PC and send them from the sender PPCI to the receiver PPCI. The data size of each event is set to 6144 bytes, which is the length of the real FADC output.

The test is carried out using two different data formats. The first format is similar to the FADC output described in Section 3.2.3.2 and is used to simulate the real use in the SVD2. However, this format has a constant bit field (the 1-bit reserved field shown in Section 3.2.3.2). In order to check all bits, we use a second format, whose data are artificially generated and use all bits, including the field that is unused in the real FADC format. Because the data contents are fixed beforehand, the receiver process can detect errors by comparing the received data with the expected data.

We send 6144 bytes of data 5,400,000 times (33 GB in total), which corresponds to a 1 kHz trigger rate for 90 minutes, the typical event rate and run duration, respectively. We detect no error in this test; all received and sent data are consistent.

Next, we test the boundary of each read action of the PPCI. The read size of the receiver PPCI is varied so that the transferred data are fragmented in various ways, since fragmentation may cause missing and/or extra data at the boundary of each read action. The read size is varied as follows: 396 bytes, 400 bytes, 6128 bytes, 6132 bytes, 6136 bytes, 6140 bytes, 6144 bytes, 6148 bytes, 6152 bytes, 6156 bytes, 6160 bytes, and random sizes. The size of the dummy event is unchanged (6144 bytes) through all variations.

We check all combinations of the two data formats and the eleven fragmentation sizes ( $2 \times 11 = 22$  patterns). In each case, we send 6144 bytes of data 5,400,000 times. We find no error in this test either.

The total transferred size in the 22 transactions is 730 Gbytes. Since we observe no error, the error event rate is less than  $2.5 \times 10^{-8}$  at a 95% confidence level.
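The quoted upper limit can be reproduced with a short calculation. The text does not state how the limit was derived; the sketch below assumes the standard Poisson argument for zero observed events, in which the upper limit on the expected number of errors at confidence level CL is $-\ln(1 - \mathrm{CL})$. The function name is hypothetical.

```python
import math


def upper_limit_zero_observed(n_events: int, confidence: float = 0.95) -> float:
    """Upper limit on the per-event error probability when no error is seen.

    With zero observed errors, exp(-mu) = 1 - confidence gives the largest
    expected count mu that is still compatible with the observation.
    """
    mu_max = -math.log(1.0 - confidence)   # about 3.0 for 95%
    return mu_max / n_events


n_patterns = 22
events_per_pattern = 5_400_000
n_events = n_patterns * events_per_pattern      # 118.8 million events
total_bytes = n_events * 6144                   # about 730 GB

limit = upper_limit_zero_observed(n_events)
print(f"{n_events} events, {total_bytes / 1e9:.0f} GB, limit = {limit:.2e}")
```

Under this assumption the result is close to  $2.5 \times 10^{-8}$  per event, in agreement with the value quoted above.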

## 4.6 Performance Summary of PPCI

The data receiving rate of the PPCI is measured to be 24.8 MB/s when three PPCI boards are installed, and 20.6 MB/s when four PPCI boards are installed. The speed is much faster than the required speed of 6 MB/s. No error is detected in 730 GB of data transfer with the PPCI, which corresponds to an error event probability of less than  $2.5 \times 10^{-8}$ .

We conclude that the PPCI board is fast and reliable enough to be used in the real experiments.

## Chapter 5

## Data Processing in PC

The SVD2 has 110,592 strips in total. The number of strips of interest, those associated with charged tracks, is less than 100 in real data. Since the other strip data are not needed, we discard the data unrelated to real hits and make the data size as small as possible. We therefore have to carry out pre-hit finding at the online stage. This processing must be faster than the expected trigger rate of 1 kHz so that it introduces no dead time. In the SVD2, we use PCs to perform the pre-hit finding instead of the HALNY, a special hardware module equipped with DSPs that was used in the SVD1. Using the HALNY requires skill in assembly language programming. On the PCs, in contrast, the data acquisition code is easier to improve because it is written in the C language. In this chapter, we explain the data processing in the PC.

The specification of the PC used for the data processing is shown in Table 5.1.

| Item     | Specification                   |
|----------|---------------------------------|
| CPU      | Intel® Xeon™ CPU 2.40 GHz Dual  |
| Memory   | 256 MB                          |
| OS       | Red Hat Linux release 7.3       |
| Compiler | Intel® C++ Compiler Version 6.0 |

Table 5.1: Specifications of PC Used for Data Compression

### 5.1 Overview of Data Process in a PC

Figure 5.1 shows the data flow in the case that three PPCIs are installed in one PC. A square, an arrow, and an ellipse indicate a thread, a data stream, and a shared memory, respectively. In a single process in the PC, there are three data streams from the FADCs via the PPCIs. They are gathered and sent to the event builder. We use POSIX threads so that the threads can communicate with each other and cooperate within a single process.



Figure 5.1: The flowchart of the data processing in the PC in case that three PPCIs are installed. A square, an arrow, and an ellipse indicate a thread, a data stream, and a shared memory, respectively.

#### 5.1.1 Read from PPCI Thread

The "read from PPCI thread" reads the FADC data through the PPCI and writes them to a "double buffer". This thread is separated from the "data process and reformat thread" so that we can read data from the FADC as fast as possible, without being blocked until the data processing is completed. This reduces the dead time caused by overflow of the FIFO on the FADC.

#### 5.1.2 Double Buffer

Since the event rate follows a Poisson distribution, the instantaneous trigger rate can be much higher than the mean value of 1 kHz. Even in such cases, we have to be able to take those events. The FADC has FIFOs that can hold the data of up to six events. However, since a six-event capacity is not enough to eliminate dead time, we prepare a buffer named the "double buffer" in the PCs. This buffer can hold the data of up to 200 events.



Figure 5.2: The schematic drawing of how the double buffer works.

Figure 5.2 shows how the double buffer works. The double buffer has two internal buffers, each of which can hold the data of up to 100 events. While the "read from PPCI thread" is writing to one buffer, the "data process and reformat thread" is reading from the other. When the buffer from which the "data process and reformat thread" is reading becomes empty, we switch the two internal buffers.
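The switching scheme can be sketched with two queues and a condition variable. The code below is a minimal Python illustration of the idea, not the C implementation used in the DAQ; the class and method names are hypothetical, and the `close` method is added only so that the example can terminate.

```python
import threading
from collections import deque


class DoubleBuffer:
    """Two internal buffers: one is written while the other is read."""

    def __init__(self, capacity=100):
        self._write = deque()      # filled by the reading thread
        self._read = deque()       # drained by the processing thread
        self._capacity = capacity  # events per internal buffer
        self._cond = threading.Condition()
        self._closed = False

    def put(self, event):
        with self._cond:
            while len(self._write) >= self._capacity:
                self._cond.wait()          # write side full, wait for a switch
            self._write.append(event)
            self._cond.notify_all()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        """Return the next event, or None after close() once all is drained."""
        with self._cond:
            while not self._read:
                if self._write:
                    # Read side is empty: switch the two internal buffers.
                    self._read, self._write = self._write, self._read
                    self._cond.notify_all()
                elif self._closed:
                    return None
                else:
                    self._cond.wait()
            return self._read.popleft()


def run_demo(n_events, capacity=100):
    """Push n_events integers through the buffer with two threads."""
    buf = DoubleBuffer(capacity)
    out = []

    def reader():
        for i in range(n_events):
            buf.put(i)
        buf.close()

    def processor():
        while True:
            event = buf.get()
            if event is None:
                break
            out.append(event)

    threads = [threading.Thread(target=reader),
               threading.Thread(target=processor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Because the buffers are switched only when the read side is empty, the event order is preserved; `run_demo(1000)` returns the integers 0 to 999 in order.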

#### 5.1.3 Data Process and Reformat Thread

The "data process and reformat thread" receives data from the double buffer, checks them for possible errors, processes them (pedestal subtraction and pre-hit finding), and reformats them into an offline format. In the error check routine, we check the consistency of the parity bit, start bit, stop bit, and VA1TA chip number. We also check the consistency of the event tag through all the FADC output. These processes are the heaviest load on the CPU, so a fast calculation algorithm is needed. Detailed descriptions of the data processing and reformatting are given in Section 5.3.

#### 5.1.4 Merge Buffer

The "merge buffer" stores the compressed data. Its purpose is to separate the "data process and reformat thread" from the "merge event and send to event builder thread". This buffer can hold up to 10 events.

#### 5.1.5 Merge Event and Send to Event Builder Thread

The "merge event and send to event builder thread" gathers data from the three merge buffers. After gathering those data, this thread checks the consistency of the event counter with an internal counter. The gathered data are sent to the event builder through a LAN cable using TCP/IP.

### 5.2 Explanation for Characteristic Values of FADC

In this section, we explain the pedestal, common mode noise, and intrinsic noise, which characterize the feature of strip signals. These quantities are used in the data processing.

Pedestal values are the mean ADC counts over all events being analyzed. As strips on the p(n)-side produce positive (negative) signals, and both signals have to be measured with a single ADC, pedestal values of around half of the maximum ADC count (about 512 counts) are desired. Each strip has its own characteristic pedestal value, which does not change much throughout its operation.

Within an individual VA1TA chip, external noise adds a common offset to the strip signals after the serialization by the FADC. We call this offset the common mode offset. It varies event by event, and its variation is a kind of noise induced by the external system.

When there is no hit signal, the pedestal- and common-mode-subtracted ADC counts still fluctuate around zero according to a Gaussian distribution. This fluctuation, the intrinsic noise, is induced by Johnson thermal noise, shot noise, and 1/f noise. The intrinsic noise is defined as the root-mean-square (RMS) of this fluctuation of the ADC counts. Its value differs strip by strip.

Pedestals and common mode are calculated and subtracted in the data processing. True signal is identified in the pedestal- and common-mode-subtracted data by computing a ratio of signal to the intrinsic noise.

## 5.3 Data Processing and Reformatting

In this section, we explain the data processing and reformatting step by step. Figure 5.3 shows the data processing procedure. First, pedestals and common mode offsets are subtracted; then the signal to noise ratio is calculated to search for signal hits; and finally the data are reformatted and packed into the offline format.



Figure 5.3: The flowchart of the data compression.

#### 5.3.1 Pedestal Subtraction

First, pedestals are subtracted from the ADC counts. Pedestal values are updated every 100 events. An updated pedestal value is determined by taking a weighted mean of the old pedestal value and the last ADC value. We denote the weight applied to the last ADC value by a, and we set a = 1/64 empirically.
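Written as a formula, the weighted mean reads  $p_{\mathrm{new}} = (1 - a)\,p_{\mathrm{old}} + a\,\mathrm{ADC}$ . The following Python sketch illustrates the update; the class name is hypothetical, and it assumes that the subtraction for the 100th event uses the pedestal from before the update, which the text does not specify.

```python
class PedestalTracker:
    """Per-strip pedestal with a slow running update (illustrative only)."""

    def __init__(self, initial, a=1.0 / 64.0, period=100):
        self.pedestal = list(initial)  # one value per strip
        self.a = a                     # weight of the last ADC value
        self.period = period           # update every `period` events
        self.n_events = 0

    def process(self, adc):
        """Return pedestal-subtracted counts and update the pedestal if due."""
        self.n_events += 1
        subtracted = [x - p for x, p in zip(adc, self.pedestal)]
        if self.n_events % self.period == 0:
            a = self.a
            self.pedestal = [(1.0 - a) * p + a * x
                             for x, p in zip(adc, self.pedestal)]
        return subtracted
```

With a = 1/64, a single ADC value that lies 64 counts above the pedestal moves the pedestal by one count, so isolated hits disturb the pedestal only slightly.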

#### 5.3.2 Common Mode Subtraction

In this stage, the common mode is calculated and subtracted from the pedestal-subtracted data. The common mode is calculated for each VA1TA chip. Only strips that have no real hit are used for the calculation: we regard strips whose ADC counts exceed  $3\sigma$  as hit strips and exclude them. The common mode is the average of the ADC counts of the remaining strips.
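The calculation for one chip can be sketched as follows. This is an illustration under stated assumptions rather than the actual algorithm:  $\sigma$  is taken to be the intrinsic noise of each strip, the cut is applied to the absolute value of the pedestal-subtracted count, and the function names are hypothetical.

```python
def common_mode(counts, noise, n_sigma=3.0):
    """Average of the pedestal-subtracted counts of strips without hits.

    counts : pedestal-subtracted ADC counts of one VA1TA chip
    noise  : intrinsic noise (sigma) of each strip
    """
    kept = [x for x, s in zip(counts, noise) if abs(x) <= n_sigma * s]
    if not kept:
        return 0.0   # no usable strip: leave the data unchanged
    return sum(kept) / len(kept)


def subtract_common_mode(counts, noise, n_sigma=3.0):
    """Return the counts of one chip after common mode subtraction."""
    cm = common_mode(counts, noise, n_sigma)
    return [x - cm for x in counts]
```

For example, with eight strips at `[2, 2, 2, 2, 2, 2, 2, 50]` and a noise of 3 counts each, the last strip is excluded, the common mode is 2, and the subtracted counts are `[0, 0, 0, 0, 0, 0, 0, 48]`.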

#### 5.3.3 Signal to Noise Ratio Calculation

Signal count is defined as a count of the pedestal- and common-mode-subtracted data, which is shown in Figure 5.4.



Figure 5.4: The hatched area shows the fluctuation of the pedestal and common mode subtracted ADC counts. The RMS of the fluctuation is defined as the intrinsic noise. The signal ADC count is pedestal- and common-mode-subtracted data.

In strips that have a hit, the hit signal is masked by the intrinsic noise. To find the signal hits, we use the signal to noise ratio, which is defined as the signal count divided by the intrinsic noise. The intrinsic noise value is updated every 100 events. The signal to noise ratio should be 1 for strips without any hit; otherwise the intrinsic noise is overestimated or underestimated and its value should be updated. If the new signal to noise ratio is greater than 1, we increase the intrinsic noise value, and if it is less than 1, we decrease the intrinsic noise value, using an empirically determined rule.

#### 5.3.4 Signal Hit Search

Strips that have signal hits are searched for using the signal to noise ratio of each strip. The angle between the particle momentum direction and the DSSD varies from track to track. As shown in Figure 5.5, a particle that penetrates with a smaller incident angle deposits more energy on a single strip. In this case, a "cluster" consists of a single strip, where a "cluster" is defined as a set of consecutive strips that have signal hits. On the other hand, a particle with a larger incident angle deposits energy on many strips and forms a wider cluster.

We set a threshold level for a single strip (*cluster size* = 1) to extract signal hits with the following formula:

$$(S/N)_1 \ge \alpha$$
 ( $\alpha$  is a constant value)

If we applied this single-strip threshold to a wide cluster, we would miss the cluster. It is desirable to accept as wide a variety of hit clusters as possible. Therefore, we set a lower per-strip threshold for the wider clusters. To find the wider clusters, we also check the sum over neighboring strips. The intrinsic noise contribution to the sum of two neighboring strips is  $\sqrt{2}$  times that of a single strip. We set the threshold level for a double strip cluster (*cluster size* = 2) as:

$$(S/N)_2 \ge \sqrt{2} \times \alpha$$

We thus search for clusters whose cluster size n is 1, 2, 3, 4, 6, or 8, applying the threshold level  $(S/N)_n \ge \sqrt{n} \times \alpha$ .

In our hit search, we set  $\alpha = 4.0$  as the threshold value, empirically. We regard the strips that have a higher S/N than the threshold level as hit strips. In our measurement, the intrinsic noise level in the SVD2 is 2 to 5 ADC counts, depending on the strip.
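The threshold scheme can be illustrated with a sliding-window search. The sketch below is one possible reading of the text, not the algorithm used in the DAQ: it takes  $(S/N)_n$  to be the sum of the per-strip signal to noise ratios over n consecutive strips and marks every strip of a window that passes the threshold. The function name is hypothetical.

```python
import math

CLUSTER_SIZES = (1, 2, 3, 4, 6, 8)   # cluster sizes quoted in the text


def find_hit_strips(snr, alpha=4.0, sizes=CLUSTER_SIZES):
    """Return the sorted indices of strips that belong to a passing window.

    snr : per-strip signal to noise ratio of one chip
    """
    hits = set()
    for n in sizes:
        threshold = math.sqrt(n) * alpha
        for first in range(len(snr) - n + 1):
            if sum(snr[first:first + n]) >= threshold:
                hits.update(range(first, first + n))
    return sorted(hits)
```

For example, in `[0, 0, 5, 0, 0, 0, 0, 0, 3, 3, 0, 0]` strip 2 passes the single-strip threshold, and strips 8 and 9 fail it individually but pass together as a two-strip cluster (6 ≥ 4√2 ≈ 5.66). A real implementation would probably also trim low strips at the edges of a window, which this sketch does not do.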

#### 5.3.5 Data Reformatting

After the hit search, the strip data selected by the above process are formatted into the offline format.

We show the offline data format in Figure 5.6. The formatted data consist of a header, strip hit data, pedestal and noise values, and a footer. The data size of header, pedestal and noise value, and footer are 24 bytes, 36 bytes, and 4 bytes, respectively. The data size of the strip hit data varies from 5 to 1024 bytes.




Figure 5.5: A charged particle penetrating a DSSD with (a) a small incident angle and (b) a large incident angle.

| Size                    | Content                      |
|-------------------------|------------------------------|
| 24 bytes                | Header                       |
| variable (5~1024 bytes) | Strip Data                   |
| 36 bytes                | Pedestal and Intrinsic Noise |
| 4 bytes                 | Footer                       |

Figure 5.6: The data packing format of one hybrid. The formatted data consist of a header, strip hit data, pedestal and noise values, and a footer.

### 5.4 System Test Environments

We check the performance of the data processing in the PC. Since the SVD2 detector is not installed in the Belle experiment yet, we make a test system similar to the real SVD2 experiment. The difference in the system between the real experiment and the test setup is the number of ladders. Since we can not use the real event builder, we develop our own event builder for the SVD2 system test.

Since we do not have all ladders, we use a part of ladders already assembled for this test. The arrangement of the ladders that we use in this system test is shown in Figure 5.7, where installed ladders are represented by filled boxes and vacant ladder slots are represented by open boxes.



Figure 5.7: Ladder and scintillation counter arrangement for the system test. Installed ladders are represented by filled boxes and vacant ladder slots are represented by open boxes.

To obtain a higher trigger rate than that of cosmic rays, we use the isotope  ${}^{90}$Sr. We use a scintillation counter, on which the isotope is placed, as a source of random triggers. The scintillation counter is located above the SVD2 as shown in Figure 5.7. We set the discriminator threshold on the scintillation counter pulse height so as to obtain a trigger rate of about 2 kHz on average.

### 5.5 Event Builder for System Test

Figure 5.8 shows the data acquisition system from the PC farm to the storage system for the system test.



Figure 5.8: The data acquisition system from the PC farm to storage system for the system test.

We develop an event builder only for the system test. For the real experiment we use the event builder provided by the central DAQ system.

Four PCs are connected to one network card via a hub. The network is used only to send the reformatted data. It is linked with LAN cables, and data are transferred over TCP/IP on 100BASE-T. There are three such networks for the twelve PCs in total. The event builder waits for the data from the PCs. When reformatted data arrive, the event builder makes sure that the event tags carried by the reformatted data from all PCs are the same, and merges the data. The gathered data are saved to disk. A SCSI cable connects the event builder to the disk storage, which has a capacity of about 1 Tbyte in total.

We show the specification of the PC used for the event builder in Table 5.2.

### 5.6 Limitation of Event Builder

The total receiving rate from the PCs increases in proportion to the number of sender PCs. We found that there is a limitation for the event builder to receive data from

| Item     | Specification                                 |
|----------|-----------------------------------------------|
| CPU      | Intel® Pentium® 4 CPU 2.00GHz                 |
| Memory   | 512MB                                         |
| OS       | Red Hat Linux release 8.0 (Psyche)            |
| Compiler | Intel® C++ Compiler Version 6.0               |

Table 5.2: Specifications of the PC used for the event builder.

a PC via TCP/IP, which becomes a very heavy load when the event rate reaches 1 kHz (1.4 Mbytes/sec/PC).

Figure 5.9 shows the process speed as a function of the DAQ operation time for 1, 4, 6, and 12 sender PCs, where the horizontal axis is the time after data acquisition begins. Figure 5.10 summarizes the processing speed in those cases. We draw an inverse-proportionality line in the figure; along this line, the total data size that the event builder receives per second is constant. When the number of sender PCs is more than 4, the measured speed lies on the line, so the total receiving data rate is constant. From this line, the total data rate that the event builder can receive is about 8 MB/sec. This tells us that the process speed including data transfer is limited by the event builder.

What we want to measure is the speed limitation of the PC processing, not that of the event builder. Since the processing speeds for one PC and for four PCs are almost the same, we can expect the process speed to be nearly constant if the number of PCs from which the event builder receives data is four or less. In that range the speed is considered to be limited by the PC processing, which means that the number of PCs does not affect the process-speed measurement. In the case of four PCs, we use PCs in the same network (Figure 5.8). This means that the process speed is not limited by the network, but by the load inside the event builder. We therefore measure the process speed using four PCs.

### 5.7 Compression Ratio

We define the occupancy as the percentage of strips that have a signal among all strips. We check the relation between the occupancy and the data size. Figure 5.11 shows the relation between the output data size and the number of strips which have



Figure 5.9: Time Dependence of Data-Processing Speed



Figure 5.10: Data-Processing Speed in Various Number of Sender PCs

hits.

In this figure, one point corresponds to one event. The result shows that the output file size depends linearly on the occupancy. The offset at zero occupancy is 5112 bytes in the system we use (a one-third system). This offset is the sum of the hybrid information in the reformatted data. We define the compression ratio as the ratio of the total output data size to the data size from the FADCs. At zero occupancy, the total data size per event is 15 kbytes, which corresponds to a 6.9% compression ratio. When the average occupancy is 5%, the output data size is 25.6 kbytes (Table 5.3), which corresponds to an 11.6% compression ratio.

The total data size of one event and the compression ratio for several sampled occupancies are summarized in Table 5.3.
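The zero-occupancy compression ratio can be cross-checked from numbers given elsewhere in the text. The sketch below is a consistency check under stated assumptions: the number of FADCs is inferred from 110,592 strips and 24 × 128 strips per FADC, and the zero-occupancy size of the full system is taken as three times the 5112-byte offset measured with the one-third system.

```python
STRIPS_TOTAL = 110_592          # strips in the SVD2
STRIPS_PER_FADC = 24 * 128      # 24 VA1TA chips of 128 strips each
FADC_EVENT_BYTES = 6144         # FADC output per event
OFFSET_ONE_THIRD_BYTES = 5112   # measured offset at zero occupancy

n_fadc = STRIPS_TOTAL // STRIPS_PER_FADC            # inferred, not quoted
raw_bytes = n_fadc * FADC_EVENT_BYTES               # input size per event
zero_occupancy_bytes = 3 * OFFSET_ONE_THIRD_BYTES   # assumed scaling by 3

ratio_percent = 100.0 * zero_occupancy_bytes / raw_bytes
print(f"{n_fadc} FADCs, {raw_bytes} bytes in, "
      f"{zero_occupancy_bytes} bytes out, {ratio_percent:.1f}%")
```

Under these assumptions the ratio comes out at about 6.9%, consistent with the first row of Table 5.3.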

### 5.8 Speed Test

In the above section, we measured the process speed in a low occupancy environment. Here we check how the process speed changes with the occupancy. Since we cannot generate real signal hits in the SVD2, we vary the occupancy and measure the process speed by changing the threshold



Figure 5.11: Compression Data Size

Table 5.3: Correspondence between occupancy and compression ratio.

| Occupancy (%) | Total Data Size per Event (kByte)    | Compression Ratio (%) |
|---------------|--------------------------------------|-----------------------|
| 0.0           | 15.0                                 | 6.9                   |
| 2.0           | 19.4                                 | 8.8                   |
| 5.0           | 25.6                                 | 11.6                  |
| 8.0           | 31.7                                 | 14.4                  |

levels.

Figure 5.12 shows the result. The processing speed depends on the occupancy. We fit the results with the function  $f(x) = \frac{1}{a \times x + b}$ , where x is the occupancy and a and b are constants: a is the coefficient of the part of the processing time that grows with the occupancy, and b is the remaining processing time. We take data at four points. In the real experiment, the occupancy is about 5 percent in the SVD1. In the SVD2, the occupancy is expected to increase, because the innermost layer of the SVD2 is nearer to the beam pipe.

Since the data processing program has a sufficiently large buffer (details are given in Section 5.1.2), it can absorb an instantaneously high trigger rate. If the average processing speed is higher than the trigger rate, the data processing causes no dead time apart from the limitation of the data reading rate. When the processing rate is less than the trigger rate, dead time occurs and events are missed; the missing event rate is the difference between the trigger rate and the processing rate.
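The fit function and the dead-time argument can be combined in a few lines. The sketch below only illustrates the relations stated above; the function names are hypothetical, and the parameter values used in the example are made up for illustration and are not the fitted values.

```python
def processing_rate(occupancy, a, b):
    """Processing rate in Hz from the model f(x) = 1 / (a * x + b)."""
    return 1.0 / (a * occupancy + b)


def missing_event_rate(trigger_hz, occupancy, a, b):
    """Rate of missed events: zero while the processing keeps up."""
    return max(0.0, trigger_hz - processing_rate(occupancy, a, b))


def max_occupancy(trigger_hz, a, b):
    """Occupancy at which the processing rate equals the trigger rate."""
    return (1.0 / trigger_hz - b) / a


# Illustrative parameters only (seconds per event per % and seconds per event).
A_EXAMPLE = 5.0e-5
B_EXAMPLE = 5.0e-4

print(processing_rate(0.0, A_EXAMPLE, B_EXAMPLE))
print(max_occupancy(1000.0, A_EXAMPLE, B_EXAMPLE))
print(missing_event_rate(1000.0, 15.0, A_EXAMPLE, B_EXAMPLE))
```

With the fitted a and b in place of the illustrative values, `max_occupancy(1000.0, a, b)` gives the occupancy up to which a 1 kHz trigger rate can be sustained.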

The expected trigger rate is 1 kHz. We therefore conclude that our data taking system will be able to operate at an occupancy level of about 8% while processing the data of three FADCs in one PC. If the occupancy is higher than this value, the number of FADCs processed per PC can be decreased to avoid missing events.

#### 5.9 Stability Test

We carry out a long run test at a low trigger rate. DAQ at a high trigger rate produces output data at a very high rate, so long-term operation at a high trigger rate needs very large storage. Since we have to do the long-term test with limited storage, we do the test at a low trigger rate. The differences from the previous test are the trigger source and the number of PCs that send data to the event builder.

We use cosmic ray muon events. We put one scintillation counter above the SVD2 and another below it, so that the SVD2 is sandwiched between them. We take the coincidence of the two scintillation counters and use it as the trigger source. The trigger rate is about 2.5 Hz.

At a low DAQ rate, we can use the full system, because the data traffic is small and the event builder is not limited by a heavy transfer rate.

We operate the DAQ system for one and a half days (36 hours), and we obtain 37 million event data without any problems.

Figures 5.13 and 5.14 show the variation of the pedestal value and the intrinsic noise value



Figure 5.12: Process speed in PC

in the long run test. Panel (a) is taken when the pedestal and intrinsic noise values have become stable, (d) is at the end of the long run, and (b) and (c) are at intermediate times. These two values are stable throughout the test run. As the pedestal and intrinsic noise values are calculated inside the data processing program, these graphs show that the data processing program works stably.

## 5.10 Summary and Discussion of Data Processing in PC

The data processing speed in a PC was measured for the case where one PC handles the data from three FADCs. The present occupancy is about 5%, but since the innermost layer of the SVD2 is 1.5 times closer to the beam ($3.0\,\text{cm} \rightarrow 2.0\,\text{cm}$), we assumed, considering the beam background, that the occupancy is 1.5 times larger (8%) in the worst case. When the occupancy is 8%, the data processing speed is 1.1 kHz. This means that data processing can be operated with one PC processing three FADCs. If the occupancy is more than 8%, operation in this configuration causes dead time. Since the processing time per event is proportional to the amount of data processed, in such a case reducing the number of FADCs processed by one PC makes the data processing faster.

Figure 5.13: Pedestal Alteration in Long Run Test

Figure 5.14: Intrinsic Noise Alteration in Long Run Test

We estimate the occupancy in the SVD2. As summarized in Table 5.4, in the SVD1 the occupancies of layers one, two, and three are 5%, 3%, and 2%, respectively. The average occupancy, weighted by the number of VA1 chips treated by one processor, is 3.1%.

Table 5.4: Measured occupancy and distance from the center of the beam pipe for the SVD1.

| Layer                                        | 1    | 2    | 3    |
|----------------------------------------------|------|------|------|
| Radius (mm)                                  | 30.0 | 45.5 | 60.5 |
| Measured occupancy (%)                       | 5.0  | 3.0  | 2.0  |
| Number of VA1 chips treated by one processor | 5    | 6.25 | 8.75 |

We fit the measured occupancy as a function of the layer radius, r, with $f(r) = a \times r^{-2}$. From the fit result, we estimate the occupancy in each layer of the SVD2, as listed in Table 5.5. One PC processes one first-layer ladder, two second-layer ladders, three third-layer ladders, and three fourth-layer ladders. The load on a PC depends on the average occupancy weighted by the number of ladders in each layer. The estimated average value is 2.5%.

Table 5.5: Estimated occupancy and distance from the center of the beam pipe for the SVD2.

| Layer                                 | 1    | 2    | 3    | 4    |
|---------------------------------------|------|------|------|------|
| Radius (mm)                           | 20.0 | 43.5 | 70.0 | 88.0 |
| Estimated occupancy (%)               | 12.2 | 2.6  | 1.0  | 0.6  |
| Number of ladders processed in one PC | 1    | 2    | 3    | 3    |
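The estimates in Tables 5.4 and 5.5 can be reproduced with a short calculation. The sketch below assumes an unweighted linear least-squares fit for the single parameter a (the fit method is not stated in the text) and takes the first-layer radius of the SVD2 as 20.0 mm, following the 2.0 cm quoted above. The function and variable names are illustrative.

```python
# Measured SVD1 values from Table 5.4.
svd1_radius_mm = [30.0, 45.5, 60.5]
svd1_occupancy = [5.0, 3.0, 2.0]     # percent
svd1_chips = [5.0, 6.25, 8.75]       # VA1 chips treated by one processor

# SVD2 geometry and ladder assignment (Table 5.5 and the text).
svd2_radius_mm = [20.0, 43.5, 70.0, 88.0]
svd2_ladders = [1, 2, 3, 3]          # ladders processed in one PC


def fit_inverse_square(radii, values):
    """Least-squares estimate of a in f(r) = a * r**-2 (unweighted: an assumption)."""
    x = [r ** -2 for r in radii]
    return sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)


def weighted_mean(values, weights):
    """Average of values weighted by weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


a = fit_inverse_square(svd1_radius_mm, svd1_occupancy)
svd2_occupancy = [a / r ** 2 for r in svd2_radius_mm]

print(round(weighted_mean(svd1_occupancy, svd1_chips), 1))    # SVD1 average (%)
print([round(o, 1) for o in svd2_occupancy])                  # SVD2 per-layer estimate (%)
print(round(weighted_mean(svd2_occupancy, svd2_ladders), 1))  # SVD2 average (%)
```

Under these assumptions the per-layer estimates and the two weighted averages round to the values quoted in the tables and in the text.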

## Chapter 6

## Conclusion

We have developed the PC-based data acquisition system (DAQ) for the upgraded Silicon Vertex Detector (SVD2) of the Belle experiment.

We utilize PCI boards (PPCIs) as the interface that receives the data from the flash analog-to-digital converters (FADCs). We have studied the data receiving speed of the PPCIs. The measured data receiving speed is 24.8 Mbytes/sec/board with three PPCIs installed in one PC. This result satisfies the requirement of 6 Mbytes/sec/board. We have then tested the reliability of the PPCI by checking the error rate in data transfer between two PPCIs. After 730 Gbytes of data were transferred, we found no change in the data. We also found neither data loss nor extra data at the boundary of each block read by the PPCI, which assures that there is no fault in the data transfer protocol from the FADC to the PPCI. We conclude that the PPCI has high performance in data receiving speed and that the system is reliable enough for the SVD DAQ.

We have developed the data reduction software that runs in the PC farm on the data transfer stream from the FADC to the event builder. We have studied the data processing speed at various occupancy levels. The software can keep up with a trigger rate of 1.1 kHz at the worst-case occupancy we assumed. This result satisfies the required trigger rate of 1 kHz.

We have performed a 36-hour test of the data transfer stream from the FADC to the event builder to check the long-run stability of the PPCI and the data reduction software, and we have observed no significant change in the pedestal and intrinsic noise distributions computed by the PC farm. We conclude that our new DAQ system is stable enough to be used for long-term operation.

## Acknowledgement

I would like to express my heartfelt thanks to my supervisor H. Aihara for his excellent advice.

I am especially thankful to M. Iwasaki. I express my appreciation to T. Higuchi, H. Ishino, Y. Ushiroda, Y. Yasu, J. Haba, M. Yokoyama, T. Kawasaki, Y. Igarashi, K. Hasuko, H. Tajima and J. Tanaka. I am grateful for the special support of T. Nakadaira, T. Tomura, T. Matsubara, H. Kawai, N. Uozaki, A. Kusaka, K. Tanabe and R. Ishida.

## Bibliography

- [1] J. H. Christenson, J. W. Cronin, V. L. Fitch and R. Turlay, Phys. Rev. Lett. **13** 138 (1964).
- [2] M. Kobayashi and T. Maskawa, Prog. Theor. Phys. **49** 652 (1973).
- [3] A. B. Carter and A. I. Sanda, Phys. Rev. Lett. **45** 952 (1980); Phys. Rev. **D23** 1567 (1981); I. I. Bigi and A. I. Sanda, Nucl. Phys. **B193** 85 (1981).
- [4] K. Abe *et al.*, Phys. Rev. Lett. **87** 091802 (2001).
- [5] T. D. Lee and C. N. Yang, Phys. Rev. **98** 1501 (1955); Phys. Rev. **104** 254 (1956).
- [6] C. S. Wu *et al.*, Phys. Rev. **105** 1413 (1957).
- [7] D. E. Groom *et al.* (Particle Data Group), Eur. Phys. J. **C15** 1 (2000).
- [8] L. Wolfenstein, Phys. Rev. Lett. **51** 1945 (1983).
- [9] N. Cabibbo, Phys. Rev. Lett. **10** 531 (1963).
- [10] KEKB accelerator group, KEK B-Factory Design Report, KEK Report 95-7 (1995).
- [11] Belle Collaboration, Technical Design Report, KEK Report 95-1 (1995); Belle Collaboration, BELLE Progress Report, KEK Report 96-1; Belle Collaboration, BELLE Progress Report, KEK Report 97-1.
- [12] A. Abashian *et al.*, KEK Progress Report 2000-4, submitted to Nucl. Instr. Meth.
- [13] Belle SVD group, Belle SVD Technical Design Report; G. Alimonti *et al.*, Nucl. Instr. Meth. **A453** 71 (2000); R. Abe *et al.*, IEEE Trans. Nucl. Sci. **48** 997 (2001).
- [14] E. Nygård *et al.*, Nucl. Instr. Meth. **A301** 506 (1991); O. Toker *et al.*, Nucl. Instr. Meth. **A340** 572 (1994).
- [15] M. Yokoyama and H. Tajima, Belle note 196, unpublished.
- [16] M. Tanaka *et al.*, Nucl. Instr. Meth. **A432** 422 (1999).
- [17] Y. Ushiroda *et al.*, Nucl. Instr. Meth. **A438** 460 (1999).
- [18] Belle SVD group, Technical Design Report of Belle SVD2, unpublished.
- [19] Belle SVD group Vienna team, FADCTF system -Belle SVD 2.0 readout- user's manual, unpublished.
- [20] Fird Company, Shin-Toho building 4th floor, 3-8-2, Midomimachi, Fuchu-shi, Tokyo-to, Japan http://www2.ocn.ne.jp/~fird/
- [21] Fird Company, P-PCI hardware specifications, unpublished.

# Appendix A

## CP Violation in B Decays

## A.1 Introduction

Various symmetries play very important roles in particle physics. Some of them are continuous and the others are discrete. The CP symmetry is one of the latter, and the origin of its violation is one of the most exciting mysteries in present-day particle physics. As its name indicates, the CP transformation is a product of two discrete operations, C and P.

Charge conjugation, C, is a symmetry of the sign of particle charge. Parity, P, is a symmetry of space. P invariance means that the mirror image of an experiment yields the same result as the original.

Until 1956, it was believed that all elementary processes are invariant under C and P. Lee and Yang pointed out the possibility of the violation of these symmetries [5], and subsequent experiments [6] proved that the C and P symmetries are indeed violated in weak interactions. However, the product of the C and P transformations, CP, was still believed to be a good symmetry.

The second impact came in 1964. An experiment using neutral K mesons showed that CP is also not conserved under weak interactions [1]. Neutral K mesons ( $K^0$ and  $\overline{K}^0$ ) are created by strong interactions. The mass eigenstates of the  $K^0 - \overline{K}^0$ system can be written as

$$|K_S\rangle = p|K^0\rangle + q|\overline{K}{}^0\rangle, \qquad |K_L\rangle = p|K^0\rangle - q|\overline{K}{}^0\rangle \tag{A.1}$$

(choosing the phase so that $CP|K^0\rangle = |\overline{K}{}^0\rangle$). If CP invariance held, we would have q = p, so that $K_S$ would be CP even and $K_L$ would be CP odd. Because the kaon is the lightest strange meson, it decays through the weak interaction. Neutral kaons can decay into two or three pions. Since a pion has a CP eigenvalue of -1, a two-pion final state is CP even, so the CP-odd $K_L$ could not decay into two pions and would always decay into three pions if CP were conserved in weak interactions. The experiment performed at Brookhaven proved that a small fraction of $K_L$ decays into two pions, which means that CP is violated in the weak interaction. In the kaon system, the observed CP asymmetry is of order $10^{-3}$.

## A.2 Cabibbo-Kobayashi-Maskawa Matrix

In 1973, M. Kobayashi and T. Maskawa proposed a theory of quark mixing which can introduce the CP asymmetry within the framework of the Standard Model [2]. They demonstrated that the quark mixing matrix with a measurable complex phase introduces CP violation into the interactions.

In the Standard Model, the quark-W boson interaction part of the Lagrangian is written as

$$L_{qW} = \frac{g}{\sqrt{2}} \{ \overline{u}_L \gamma_\mu W^+_\mu \mathbf{V} d_L + \text{h.c.} \}, \tag{A.2}$$

where g is the weak coupling constant,  $u_L(d_L)$  represents the left-handed component of u-type (d-type) quarks, and V is the quark-mixing matrix.

If all the elements of the quark mixing matrix $\mathbf{V}$ are real, the amplitude for a given interaction and that for its *CP* conjugate interaction are the same. In order to violate the *CP* symmetry, $\mathbf{V}$ should have at least one complex phase as its parameter.

In general, an N-dimensional unitary matrix has $N^2$ parameters, of which N(N-1)/2 are real rotation angles and N(N+1)/2 are phases. Since we can rephase the quark fields except for one overall relative phase, (2N-1) phases are absorbed and $(N-1)^2$ physical parameters are left. Among them, N(N-1)/2 are real angles and (N-1)(N-2)/2 are phases. The presence of the phases means that some of the elements must be complex, and this leads to CP violating transitions.
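The parameter counting can be checked for small N with a short function. This is an illustrative sketch only; the function name and the returned dictionary keys are not from the text.

```python
def mixing_matrix_parameters(n: int) -> dict:
    """Count the parameters of an n-generation unitary quark mixing matrix."""
    total = n * n                      # parameters of an n x n unitary matrix
    angles = n * (n - 1) // 2          # real rotation angles
    all_phases = n * (n + 1) // 2      # phases before rephasing the quark fields
    absorbed = 2 * n - 1               # phases removed by rephasing
    physical_phases = all_phases - absorbed
    assert angles + all_phases == total
    assert physical_phases == (n - 1) * (n - 2) // 2
    assert angles + physical_phases == (n - 1) ** 2
    return {"angles": angles, "phases": physical_phases}


for n in (2, 3, 4):
    print(n, mixing_matrix_parameters(n))
```

For N = 2 the count gives one angle and no phase, and for N = 3 it gives three angles and one phase.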

In the case of N = 2, two quark-lepton generations, there is one rotation angle (the Cabibbo angle) and no phase. This means that CP must be conserved in the model with four quarks.

In the case of three generations, N = 3, there are three rotation angles and one phase so that CP can be violated. The quark mixing matrix for the six-quark model can be written in many parameterizations, but two parameterizations are especially well known.

$$\mathbf{V} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \tag{A.3}$$

$$= \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta_{13}} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta_{13}} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta_{13}} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta_{13}} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta_{13}} & c_{23}c_{13} \end{pmatrix} \tag{A.4}$$

$$\simeq \begin{pmatrix} 1 - \frac{1}{2}\lambda^2 & \lambda & A\lambda^3(\rho - i\eta) \\ -\lambda & 1 - \frac{1}{2}\lambda^2 & A\lambda^2 \\ A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1 \end{pmatrix}. \tag{A.5}$$

The first parameterization (A.4) is that of the Particle Data Group [7], where $c_{ij} \equiv \cos \theta_{ij}$ and $s_{ij} \equiv \sin \theta_{ij}$ for i, j = 1, 2, 3.

The second parameterization (A.5), originated by Wolfenstein [8], is also widely used. Setting $\lambda$ to the sine of the Cabibbo angle [9], $\sin \theta_C \simeq 0.22$, and writing down all the elements in terms of powers of $\lambda$, the remaining three parameters are intended to be of order unity. This parameterization clearly indicates the hierarchy in the size of the elements. The diagonal elements are almost unity. The elements connecting adjacent generations are smaller by an order of magnitude, and the elements connecting the first and third generations are smaller still. Experimentally, the parameters A and $\lambda$ can be determined from tree-level decays and are rather well known [7]:

$$A = 0.84 \pm 0.04, \qquad \lambda = 0.2196 \pm 0.0023, \tag{A.6}$$

while  $\rho$  and  $\eta$  are not determined precisely, since their determination requires the measurement of  $V_{ub}$  and  $V_{td}$  which are of order  $\lambda^3$ .

The unitarity of the CKM matrix leads to some constraints on its elements. For example, the product of the first and the second columns leads to the equation

$$V_{ud}V_{us}^* + V_{cd}V_{cs}^* + V_{td}V_{ts}^* = 0, \tag{A.7}$$

which is related to the K meson system. Since the elements of the CKM matrix are complex, each such relation can be represented as a triangle in the complex plane. Although the unitarity of the CKM matrix leads to six triangles, most of them have one side which is much shorter than the other two sides, and consequently one tiny angle. In the Wolfenstein parameterization, we can compare the magnitudes of the three terms in equation (A.7):

$$\mathcal{O}(\lambda) + \mathcal{O}(\lambda) + \mathcal{O}(\lambda^5) = 0. \tag{A.8}$$


Figure A.1: The unitarity triangle.

This explains why the observed CP asymmetries in K decays, related to the tiny angle, are very small  $(\mathcal{O}(10^{-3}))$ .

On the other hand, the B meson system is related to the following equation:

$$V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0, \tag{A.9}$$

where all three terms are of the same order of magnitude, $\mathcal{O}(\lambda^3)$. This implies that all three angles can be large in the triangle related to Equation (A.9), which leads to the possibility of large observable *CP* asymmetries in *B* meson decays. The triangle related to the *B* meson system (illustrated in Fig. A.1) is sometimes called the "unitarity triangle".
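As a check, inserting the elements of the Wolfenstein parameterization (A.5) into Equation (A.9) shows that each term is of order $\lambda^3$ and that the terms cancel at that order. This is a leading-order sketch only; since (A.5) is itself truncated, a residual of higher order in $\lambda$ is neglected.

```latex
\begin{aligned}
V_{ud}V_{ub}^* &\simeq A\lambda^3(\rho + i\eta), \\
V_{cd}V_{cb}^* &\simeq -A\lambda^3, \\
V_{td}V_{tb}^* &\simeq A\lambda^3(1 - \rho - i\eta), \\
V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^*
  &\simeq A\lambda^3\left[(\rho + i\eta) - 1 + (1 - \rho - i\eta)\right] = 0 .
\end{aligned}
```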

Since only two generations are involved in the tree diagrams of K meson decays, the sensitivity to the parameters related to CP violation is limited in the K system. In the B meson system, all the angles $\phi_1$, $\phi_2$ and $\phi_3$ can be measured independently, which leads to precise tests of the Standard Model.

## A.3 Measuring CP Asymmetry in B Meson Decays

B mesons can be produced in two energy regions: at a center-of-mass (CM) energy equal to the $\Upsilon(4S)$ mass, or at a higher CM energy.

There are some advantages in producing B mesons in the $\Upsilon(4S)$ energy region: the $B\overline{B}$ cross section is the highest of all CM energies; $B\overline{B}$ pairs are exclusively produced (50% $B^0\overline{B}{}^0$ and 50% $B^+B^-$); and the energy of the produced B meson is known, which can be used to reduce the combinatorial background.

Figure A.2: The Feynman diagrams responsible for $B^0 - \overline{B}{}^0$ mixing.

One of the most promising methods to measure the CP angles in the B meson system is based on neutral B decays to CP eigenstates $f_{CP}$, which are common to $B^0$ and $\overline{B}{}^0$. $B^0$ and $\overline{B}{}^0$ can "mix" through the loop diagrams shown in Fig. A.2, i.e. after a certain time, a meson that was a $B^0$ at its production point is no longer a pure $B^0$ state, but a mixed state of $B^0$ and $\overline{B}{}^0$. The CP violation is induced by $B^0-\overline{B}{}^0$ mixing through the interference of the two decay amplitudes of the $B^0$, $A(B^0 \to f_{CP})$ and $A(B^0 \to \overline{B}{}^0 \to f_{CP})$. In order to detect this CP violation, one must know, or tag, the flavor of the particle ($B^0$ or $\overline{B}{}^0$) at a given time.

On the $\Upsilon(4S)$, tagging one B as a $B^0$ or a $\overline{B}{}^0$ identifies the other with certainty. Since both the C and P eigenvalues of the $\Upsilon(4S)$ are -1 and the decay of the $\Upsilon(4S)$ is caused by the strong interaction, which conserves CP, the produced $B\overline{B}$ pair should be in a CP eigenstate with an eigenvalue of +1. Because the spin of the $\Upsilon(4S)$ is 1 and that of the B is 0, the $B^0$ and $\overline{B}{}^0$ mesons are produced with an orbital angular momentum of 1, which means that the P eigenvalue of the $B\overline{B}$ system is -1. This restricts the C eigenvalue to be -1, and the $B\overline{B}$ pair will remain in a coherent state as long as neither B has decayed. If one of them is detected to be $B^0$ ($\overline{B}{}^0$) at a given moment, the other is inevitably $\overline{B}{}^0$ ($B^0$) at that time. This is extremely important for measuring the CP violation.

For example, consider the case where one B meson from the $\Upsilon(4S)$ decays into a semileptonic mode, such as $B \to D^* l \nu$ ($l = e$ or $\mu$), at a time $t_1$ after its production. If that particle was a $B^0 (=\overline{b}d)$ at $t_1$, the charge of the lepton is positive (see Fig. A.3), and if it was a $\overline{B}{}^0 (=b\overline{d})$, the charge of the lepton is negative. The flavor of the B meson which decays into the CP eigenstate is determined by the flavor of the associated B meson. There are mainly two methods to tag the flavor of a B meson. One is based on semileptonic B decays, and the other uses the charge of the kaon from the $b \to c \to s$ decay chain, which also indicates the flavor of the B.

Figure A.3: The Feynman diagrams of semileptonic $B^0$ and $\overline{B}{}^0$ decays.

When a $B^0-\overline{B}{}^0$ pair is produced with an odd relative angular momentum, the rate for one of the neutral B mesons to decay as $\overline{B}{}^0$ at $t = t_1$ and the other (which is $B^0$ at $t = t_1$) to decay into a CP eigenstate, for example $J/\psi K_S$, at $t = t_2$ is written as

$$P\left(B^0 \to J/\psi K_S; \Delta t\right) = \frac{1}{2} e^{-\Gamma |\Delta t|} (1 + \lambda \sin \Delta m_d \Delta t), \tag{A.10}$$

where  $\Gamma$  is the *B* meson total decay width,  $\Delta t \equiv t_2 - t_1$ ,  $\Delta m_d$  is the mass difference between the two weak eigenstates of neutral *B* mesons and  $\lambda$  is the *CP* asymmetry parameter

$$\lambda = -\sin(2\phi_1). \tag{A.11}$$

The CP conjugate of Eq. (A.10) is

$$P\left(\overline{B}{}^0 \to J/\psi K_S; \Delta t\right) = \frac{1}{2} e^{-\Gamma |\Delta t|} (1 - \lambda \sin \Delta m_d \Delta t). \tag{A.12}$$

The value of $\Delta t$ ranges from $-\infty$ to $+\infty$, and since the asymmetric term is odd in $\Delta t$, it is easily seen that the *CP* asymmetry vanishes in the time-integrated rate. Therefore, the measurement of the decay time difference, $\Delta t$, is required to observe the *CP* asymmetry in experiments at the $\Upsilon(4S)$.
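The cancellation in the time-integrated rate can be illustrated numerically with Eqs. (A.10) and (A.12). The values of the decay width, the mass difference and $\lambda$ used below are illustrative assumptions and are not taken from the text, and the helper names are hypothetical.

```python
import math

# Illustrative inputs (assumptions, not values from the text).
GAMMA = 1.0 / 1.5   # total decay width in 1/ps, for a lifetime of about 1.5 ps
DELTA_M = 0.47      # mass difference Delta m_d in 1/ps (approximate)
LAM = -0.7          # hypothetical value of lambda = -sin(2 phi_1)


def rate(dt, sign):
    """Eq. (A.10) for sign=+1 and its CP conjugate, Eq. (A.12), for sign=-1."""
    return 0.5 * math.exp(-GAMMA * abs(dt)) * (1.0 + sign * LAM * math.sin(DELTA_M * dt))


def asymmetry(dt):
    """Time-dependent asymmetry (P - Pbar) / (P + Pbar) = lambda * sin(dm * dt)."""
    p, pbar = rate(dt, +1), rate(dt, -1)
    return (p - pbar) / (p + pbar)


def rate_difference(dt):
    """Difference between the two rates at a given Delta t."""
    return rate(dt, +1) - rate(dt, -1)


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Integrating over all Delta t (here +-30 ps, many lifetimes) cancels the
# asymmetry, while integrating over Delta t > 0 only does not.
full = integrate(rate_difference, -30.0, 30.0)
half = integrate(rate_difference, 0.0, 30.0)
print(f"full range: {full:.2e}, positive half: {half:.4f}")
```

The integral over positive $\Delta t$ alone approaches $\lambda \Delta m_d / (\Gamma^2 + \Delta m_d^2)$, which is why the sign of $\Delta t$ has to be measured.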

In conventional $e^+e^-$ colliders, where the $e^+$ and $e^-$ beams have identical energies, the $\Upsilon(4S)$ is produced at rest, and consequently the *B* mesons are produced almost at rest. The momenta of the *B*'s from the $\Upsilon(4S)$ are about 325 MeV/*c*, and the average decay length of the *B*'s is only about 30 $\mu$m. In this case, it is impossible to measure the time dependence with the present vertex detectors.

A solution is to produce the  $\Upsilon(4S)$  moving in the laboratory frame. This can be achieved by colliding two beams of unequal energies. This results in two *B* mesons boosted in the same direction along the beam axis. The average distance between the two *B* decays is approximately  $\beta \gamma c \tau$  where  $\beta \gamma$  is the boost parameter of the center of mass and  $\tau$  is the average *B* lifetime. Since the *B* mesons move almost parallel to the beam axis, the decay time difference of two *B* mesons can be approximately calculated as

$$\Delta t \simeq \Delta z / \beta \gamma c, \tag{A.13}$$

where  $\Delta z$  is the distance between the decay vertices along the beam axis. A precise measurement of the decay vertices of the *B* mesons is necessary to measure the *CP* violation in this scheme.
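The orders of magnitude in this section can be checked with a short calculation. The sketch below assumes an approximate B meson mass of 5.28 GeV/$c^2$ and a lifetime of about 1.5 ps, and it uses the KEKB design beam energies of 8.0 GeV and 3.5 GeV. None of these numbers is quoted in this appendix, and the function names are illustrative.

```python
import math

C_UM_PER_PS = 299.792458   # speed of light in micrometres per picosecond
B_MASS_GEV = 5.28          # approximate B meson mass (assumption)
B_LIFETIME_PS = 1.5        # approximate B meson lifetime (assumption)


def beta_gamma_cm(e_high_gev, e_low_gev):
    """Boost of the CM system for head-on beams, neglecting the electron mass."""
    return (e_high_gev - e_low_gev) / (2.0 * math.sqrt(e_high_gev * e_low_gev))


def mean_decay_length_um(beta_gamma, lifetime_ps=B_LIFETIME_PS):
    """Average decay length beta * gamma * c * tau in micrometres."""
    return beta_gamma * C_UM_PER_PS * lifetime_ps


def delta_t_ps(delta_z_um, beta_gamma):
    """Eq. (A.13): decay time difference from the vertex separation along the beam."""
    return delta_z_um / (beta_gamma * C_UM_PER_PS)


# Symmetric collider: a B momentum of about 325 MeV/c gives beta * gamma = p / m.
at_rest = mean_decay_length_um(0.325 / B_MASS_GEV)

# Asymmetric collider with the assumed beam energies of 8.0 and 3.5 GeV.
bg = beta_gamma_cm(8.0, 3.5)
boosted = mean_decay_length_um(bg)

print(f"Upsilon(4S) at rest: {at_rest:.0f} um")
print(f"boosted, beta*gamma = {bg:.3f}: {boosted:.0f} um")
print(f"Delta z = 200 um corresponds to Delta t = {delta_t_ps(200.0, bg):.2f} ps")
```

Under these assumptions the boost increases the typical vertex separation from a few tens of micrometres to roughly 200 micrometres.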