A simplified asynchronous data communication system is shown in Figure 1. The receiver extracts the clock signal Clk1 from the bit stream arriving over the serial link and uses it as its working clock, while the transmitter uses the clock Clk2 generated by the local crystal oscillator and phase-locked loop as its working clock. The receiver writes data into the elastic buffer on the rising edge of Clk1, and the transmitter reads data from the buffer on the rising edge of Clk2, thereby realizing data synchronization.
Although all communication devices on a Fibre Channel arbitrated loop must operate at the same nominal frequency, the clock signals Clk1 and Clk2 in Figure 1 come from two different sources and may differ in phase. Because of manufacturing tolerances, the frequency of a crystal oscillator is allowed a certain error, here ±100 ppm (±100×10⁻⁶): a deviation of up to ±100 clock cycles is permitted over every million ideal clock cycles. When two different crystal oscillators generate clocks of the same nominal frequency, the worst-case error between them is therefore 200 ppm. So in addition to their phase difference, in the worst case the two clocks drift apart by one full clock cycle every 10⁶/200 = 5,000 cycles. For a continuous data stream, the elastic buffer used for clock synchronization has a finite size; if this accumulating clock-cycle offset is not handled correctly, the buffer will overflow, valid data will be corrupted, and system performance will suffer severely.
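The drift arithmetic above can be checked with a one-line helper (a sketch; the function name is illustrative, not from the paper):

```python
def cycles_until_one_cycle_slip(combined_ppm: float) -> float:
    """With a combined frequency error of `combined_ppm` parts per million,
    two nominally equal clocks accumulate one full cycle of offset
    after 1e6 / combined_ppm cycles."""
    return 1_000_000 / combined_ppm

# Two ±100 ppm oscillators can differ by up to 200 ppm in the worst case,
# so a one-cycle slip can occur after as few as 5,000 cycles.
print(cycles_until_one_cycle_slip(200))  # 5000.0
```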
1. Basic principles of FC-AL elastic buffer management
In an FC-AL communication system, an elastic buffer is likewise used to solve the problem of data synchronization across clock domains. By managing the buffer, fill words are added to or deleted from it at appropriate times to control the number of valid transmission words it holds (the storage unit of the elastic buffer in this design is the word), thereby compensating for clock skew. Fill words are a special class of transmission words defined in the FC-AL protocol; they are transmitted in the gaps between frames, outside the frame delimiters. Adding or deleting these special transmission words through elastic buffer management therefore neither damages data frames nor disturbs the normal operation of the loop. When to add or delete fill words is determined by the buffer's occupancy. The usage of the elastic buffer space is divided into four levels: add-fill waiting, hold, low-level delete-fill waiting, and high-level delete-fill waiting. When the buffer's write clock is slower than its read clock (in Figure 1, the frequency of Clk1 slightly lower than that of Clk2), the buffer may be emptied and misread; fill words must then be added, which effectively increases the amount of readable data and prevents the buffer from underflowing. When the write clock is faster than the read clock (the frequency of Clk1 slightly higher than that of Clk2), data in the buffer may be overwritten; fill words must then be deleted to increase the available space and prevent the buffer from overflowing.
The basic principle of elastic buffer management is shown in Figure 2. Assume the elastic buffer has a depth of 4; each small cell marked 0 or 1 in the figure represents one storage location. A 1 indicates that the location has been written with valid data that has not yet been read; a 0 indicates that the location has not been written, or that its data has already been read.
Since a read operation can only begin after data has been written into the buffer, the read is assumed to start once two locations have been written. For subsequent elastic buffer management, therefore: when exactly 2 locations are occupied, the buffer is in the hold state and reads and writes proceed normally; when more than 2 locations are occupied, the buffer is in a delete-fill waiting state, indicating that the write clock is faster than the read clock and fill words must be deleted; when fewer than 2 locations are occupied, the buffer is in the add-fill waiting state, indicating that the write clock is slower than the read clock and fill words must be added.
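The four-level occupancy classification above can be modeled in a few lines (an illustrative software model, not the RTL; the state names and the depth-4/nominal-2 parameters follow the text):

```python
def buffer_state(occupied: int, depth: int = 4, nominal: int = 2) -> str:
    """Classify elastic-buffer occupancy into the four management levels."""
    assert 0 <= occupied <= depth
    if occupied == nominal:
        return "hold"              # normal read and write
    if occupied < nominal:
        return "add-fill-wait"     # write clock slower: add fill words
    if occupied == nominal + 1:
        return "low-delete-wait"   # write clock faster: delete a fill word
    return "high-delete-wait"      # occupancy still rising: delete urgently

print([buffer_state(n) for n in range(5)])
# ['add-fill-wait', 'add-fill-wait', 'hold', 'low-delete-wait', 'high-delete-wait']
```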
2. Hardware circuit design
The key to implementing the elastic buffer with an asynchronous FIFO is to monitor the occupancy of the buffer space in order to detect any small difference between the read and write clocks, predict whether the buffer may underflow or overflow, decide when to add or delete fill words and at what level to delete, and ensure that subsequent reads and writes are unaffected after the operation. Note that fill words must be added or deleted in the read clock domain.
In asynchronous data communication systems, using an elastic buffer to synchronize data across clock domains raises two issues: data latency and buffer size. Data latency is the time between a word being written into the buffer and being read out of it. Suppose the elastic buffer holds N words and, with no data being overwritten, the current word is written into the Nth storage location while the N-1 earlier words are still unread; the word just written can then be read out only after a delay of at least N-1 read clock cycles. The larger the buffer, therefore, the greater the worst-case latency of buffered data. On the other hand, to prevent the buffer from filling or emptying before a fill-word addition or deletion can be executed in time, a larger buffer must be provisioned so that buffer management has a sufficient time window to react, reducing the probability of overflow.
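The trade-off can be quantified by combining the drift rate from Section 1 with the buffer headroom (a rough sketch under the stated 200 ppm worst case; the function name and the notion of "headroom in words" are assumptions for illustration):

```python
def cycles_of_slack(headroom_words: int, combined_ppm: float = 200.0) -> float:
    """Read clock cycles available before overflow if a fill-word operation
    is delayed: each word of headroom absorbs one full cycle of clock slip,
    and one cycle of slip accumulates every 1e6/ppm cycles."""
    return headroom_words * 1_000_000 / combined_ppm

# One spare word gives 5,000 cycles to react; two give 10,000.
print(cycles_of_slack(1), cycles_of_slack(2))
```

This is why a larger buffer eases management timing but directly worsens the N-1 cycle worst-case latency.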
To obtain the smallest possible data latency without disturbing normal transmission, once the buffer size meets the system's basic requirements it becomes important to determine the occupancy of the elastic buffer as accurately as possible. To improve the accuracy of buffer management, this article adopts the elastic buffer design shown in Figure 3. On the rising edge of the write clock, data is written into the buffer location pointed to by the write pointer, the output of the rising-edge write-address generation logic; on the rising edge of the read clock, data is read from the location pointed to by the read pointer, the output of the rising-edge read-address generation logic. In addition, read- and write-address generation logic driven by the falling clock edges is provided, although it does not take part in the actual reads and writes. The rising-edge read address is compared asynchronously with a delayed copy of the rising-edge write address, and the falling-edge read address with a delayed copy of the falling-edge write address. Combining the two comparison results determines how the buffer occupancy is changing due to the small difference between the same-frequency but differently-sourced read and write clocks. The resulting asynchronous signal is synchronized into the read clock domain by synchronization logic and controls the output of the rising-edge read-address generation logic, so that fill words can be added to or deleted from the elastic buffer and overflow is prevented.
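A simplified software model of this dual-edge scheme (an assumption for illustration only; the actual Figure 3 logic operates on registered addresses and delayed copies, which this toy model abstracts away) shows how combining the two half-cycle occupancy samples filters transient disagreement from the asynchronous pointer crossing:

```python
def occupancy(wr_ptr: int, rd_ptr: int, depth: int) -> int:
    """Words currently held: write pointer minus read pointer, modulo depth."""
    return (wr_ptr - rd_ptr) % depth

def combined_decision(occ_rising: int, occ_falling: int, nominal: int = 2) -> str:
    """Issue an add/delete request only when the rising-edge and
    falling-edge occupancy estimates agree on the direction of drift."""
    if occ_rising > nominal and occ_falling > nominal:
        return "request-delete"
    if occ_rising < nominal and occ_falling < nominal:
        return "request-add"
    return "hold"

occ = occupancy(5, 2, 8)                    # 3 words held
print(combined_decision(occ, occ))          # request-delete
print(combined_decision(3, 2))              # hold (samples disagree)
```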
3. Analysis of simulation results
The circuit of Figure 3 was described at the RTL level in Verilog and simulated with ModelSim; the simulation results are shown in Figures 4 and 5. In both figures, CLK_rcv and CLK_local are the buffer's write clock and read clock, respectively, with very close frequencies.
In Figure 4, the clock frequency of CLK_rcv is slightly lower than that of CLK_local, so the elastic buffer may underflow. When CLK_local has gained about half a clock cycle on CLK_rcv, buffer management issues an add-fill request, and the addition is performed in the nearest inter-frame gap.
In Figure 5, the clock frequency of CLK_rcv is slightly higher than that of CLK_local, so the elastic buffer may overflow. When CLK_local has fallen behind CLK_rcv by about half a clock cycle, a low-level delete-fill request is issued and the deletion is performed in the nearest gap. If the low-level deletion is not executed in time and buffer occupancy rises further, a high-level deletion is requested.
As Figures 4 and 5 show, adding a fill word means not reading the buffered data in the current read clock cycle and sending a fill word instead; deleting a fill word means, when the conditions are met, skipping the current read address and reading the data at the next address directly.
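The two read-side behaviors just described can be sketched as a per-cycle read function (a behavioral model, not the RTL; the `FILL` constant, list-based buffer, and action names are illustrative assumptions):

```python
FILL = "FILL"  # stand-in for an FC-AL fill word

def read_word(buf, rd_ptr, action):
    """One read clock cycle: return (word_sent, new_read_pointer)."""
    if action == "add":                      # hold the pointer, emit a fill word
        return FILL, rd_ptr
    if action == "delete" and buf[rd_ptr % len(buf)] == FILL:
        rd_ptr += 1                          # skip the fill word in the buffer
    return buf[rd_ptr % len(buf)], rd_ptr + 1

buf = ["D0", FILL, "D1", "D2"]
print(read_word(buf, 0, "normal"))   # ('D0', 1)   normal read, pointer advances
print(read_word(buf, 1, "add"))      # ('FILL', 1) fill word added, pointer held
print(read_word(buf, 1, "delete"))   # ('D1', 3)   fill word skipped, next data read
```

Adding a fill word leaves the read pointer in place (occupancy grows by one relative to normal operation); deleting one advances the pointer by two (occupancy shrinks by one), exactly matching the compensation described above.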
The elastic buffer design method proposed in this paper makes full use of the characteristics of the Fibre Channel protocol; by improving the accuracy of elastic buffer management it reduces the maximum possible latency of data in the buffer and helps improve the overall performance of the arbitrated loop.