ISSN- 2394-5125 VOL 07, ISSUE 09, 2020

## Improvised VLSI based DCT and DWT architectures for analyzing audio biological signals

Nidhi Joshi, Asst. Professor, School of Computing (SOC), GEHU-Dehradun Campus

DOI: 10.48047/jcr.07.09.591

#### **Abstract:**

The need for safe and effective medical image communication is growing. Images must be compressed at the sending end before transmission and decompressed at the receiving end. Several image compression methods exist, but the two most widely used are the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT). Very Large Scale Integration (VLSI) packs logic devices into ever-smaller areas, and VLSI circuits are found in computers, cars, digital cameras, and mobile phones. A Hardware Description Language (HDL) allows architectures to be modeled and described precisely; its semantics and rules differ substantially from those of conventional programming languages. Very High Speed Integrated Circuit HDL (VHDL) provides a logical description of the structure and operation of digital systems from the system level down to the gate level, and serves as a modeling language for design and simulation. In this work, Xilinx's ISE suite, in conjunction with Modelsim, is used as the Electronic Design Automation (EDA) tool for VHDL-based circuit synthesis, implementation, and simulation. Field Programmable Gate Arrays (FPGAs) can be configured to implement arbitrary logic functions. The combination of Complementary Metal Oxide Semiconductor (CMOS) and Pass Transistor Logic (PTL) technologies has produced many new logic design strategies. Among them, the Gate Diffusion Input (GDI) approach reduces design complexity when creating logic functions.

#### Keywords: Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), Very High Speed Integrated Circuit HDL (VHDL)

#### Introduction

Thanks to developments in Very Large Scale Integration (VLSI) technology, integrated circuits have brought significant benefits to high-performance computing, telecommunications, and consumer electronics. Real-time applications in science and engineering, including audio and image compression, rely heavily on coding and decoding systems based on the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT). Their high processing speed and reliability have made VLSI DCT processor chips essential for real-time coding systems. In image processing, DWT is used for compression, reconstruction, encoding, and fusion, among many other tasks. Beyond signal processing, DWT has found applications in digital communication, optical electronics, and other fields. The widespread use of VLSI technology in the development and implementation of DWT also helps reduce the size of medical devices. Medical systems have advanced significantly with the introduction of health monitoring systems that track an individual's internal and external health factors, predict illness, and indicate symptoms. These applications are computationally demanding because they depend on DCT and DWT, and high-performance, low-power implementations of these algorithms have proven difficult to achieve. Many studies have proposed digital circuits for this purpose and compared them in terms of area, power, and latency.

The initial effort in this project is focused on developing FPGA-based DCT and DWT designs. The DWT design is written in VHDL and simulated in Modelsim. The resulting synthesis report, which includes parameters such as area, delay, and power, is then used to proceed with the FPGA implementation of DWT on the Virtex FPGA development kit. The hardware implementation realizes the transforms at the gate level. These findings encouraged further work using the TANNER software to design CMOS implementations of DCT and DWT. Finally, the proposed DWT and DCT architectures are used to identify symptomatic patterns in audio biological signals and are compared with the standard system.

A discrete cosine transform (DCT) expresses a finite sequence of data points as a sum of cosine functions oscillating at different frequencies. Fourier-related transforms such as the Fast Fourier Transform (FFT) are used in scientific computing to solve differential equations numerically, while the DCT is used in engineering for lossy compression of audio and images. Using cosine rather than sine functions is critical for compression, because fewer cosine functions are needed to approximate a typical signal; for differential equations, the cosines correspond to a particular choice of boundary conditions. The DCT is similar to the Discrete Fourier Transform (DFT), but it uses only real numbers. DCT coefficients are related to the Fourier series coefficients of a periodically and symmetrically extended sequence, whereas DFT coefficients are related only to a periodically extended sequence. A DCT is equivalent to a DFT of roughly twice the length operating on real data with even symmetry, and in some variants the input or output data are shifted by half a sample. Four DCT variants are in common use.
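The relation between the DCT and a DFT of twice the length can be checked numerically. The following is a minimal sketch, assuming the unnormalized DCT-II variant (the paper does not state which variant or scaling it uses); the function names and the test signal are illustrative only.

```python
import cmath
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * k)."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * k) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dft(y):
    """Plain O(M^2) discrete Fourier transform of a sequence y."""
    m = len(y)
    return [
        sum(y[n] * cmath.exp(-2j * cmath.pi * n * k / m) for n in range(m))
        for k in range(m)
    ]


def dct2_via_dft(x):
    """DCT-II obtained from the DFT of the even-symmetric extension of x."""
    n_pts = len(x)
    extended = list(x) + list(reversed(x))      # length 2N, even symmetry
    spectrum = dft(extended)
    # The half-sample shift appears as the phase factor exp(-i*pi*k/(2N)).
    return [
        0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n_pts)) * spectrum[k]).real
        for k in range(n_pts)
    ]


samples = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]   # hypothetical test signal
direct = dct2(samples)
from_dft = dct2_via_dft(samples)
max_diff = max(abs(a - b) for a, b in zip(direct, from_dft))
print("largest difference between the two computations:", max_diff)
```

Both routes should agree to within floating-point rounding, which illustrates why only real arithmetic is needed once the symmetry is exploited.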

The DCT is an orthogonal transform adopted by multimedia standards and commonly used in image compression. It is one of 16 related discrete trigonometric transforms. Computing the 2-D DCT of an $N \times N$ array directly from its definition requires $O(N^4)$ operations. Because fast algorithms exist and only real arithmetic is needed, the DCT is widely used in image compression applications. We present an analog VLSI architecture and a digital architecture for computing the DCT. Following the general definition of the DCT, analog and digital multipliers multiply the input samples by all the DCT coefficients simultaneously. In the analog circuit, a cross-point switch routes the multiplied values to separate integrators, which perform the required addition and subtraction; in the digital domain, an adder and a subtractor are employed. Figure 1 shows the signal flow diagram of the 4-point DCT. The proposed design is easy to implement and well suited to reducing silicon area and power, although at the expense of some precision. Implementing DWTs and DCTs in complementary metal oxide semiconductor (CMOS) technology yields good performance, low power consumption, good noise immunity, and high-density logic functions. The Finite Impulse Response (FIR) filter for the DWT design was first realized with a Computation Sharing Multiplier (CSHM). Theoretical analysis suggests that a truncated multiplier is more efficient and occupies less area than the CSHM. The modeling and implementation of a CMOS realization of the DWT, using both a standard FIR filter and a proposed GDI-based FIR filter, are presented next. Simulation data from the Tanner EDA tool are collected and compared in terms of delay and area for both the traditional and the proposed DWT architectures. The results of implementing a straightforward analog DCT design are also given.
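The multiply-then-add/subtract structure of a 4-point DCT can be sketched in software. This is an illustrative model only, assuming the unnormalized DCT-II; it is not the circuit of the paper, and the names `dct4_matrix` and `dct4_butterfly` are hypothetical.

```python
import math

N = 4
# Coefficient matrix of the unnormalized 4-point DCT-II.
COEFF = [[math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N)]
         for k in range(N)]


def dct4_matrix(x):
    """Multiply every input sample by every coefficient, then accumulate."""
    return [sum(COEFF[k][n] * x[n] for n in range(N)) for k in range(N)]


def dct4_butterfly(x):
    """Same transform with the additions and subtractions done before multiplying."""
    c1 = math.cos(math.pi / 8)
    c2 = math.cos(math.pi / 4)
    c3 = math.cos(3 * math.pi / 8)
    s0, s1 = x[0] + x[3], x[1] + x[2]      # adder outputs
    d0, d1 = x[0] - x[3], x[1] - x[2]      # subtractor outputs
    return [
        s0 + s1,
        c1 * d0 + c3 * d1,
        c2 * (s0 - s1),
        c3 * d0 - c1 * d1,
    ]


x = [3.0, 1.0, -2.0, 5.0]                  # hypothetical input samples
y_matrix = dct4_matrix(x)
y_butterfly = dct4_butterfly(x)
print(y_matrix)
print(y_butterfly)
```

The matrix form needs 16 multiplications, while the add/subtract-first form needs 5, which is one reason adders and subtractors appear alongside the multipliers in DCT datapaths.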

The DWT and DCT designs implemented in VHDL provide good results. The CMOS implementation of these designs is therefore pursued next so that the back-end work can be completed. The Gate Diffusion Input (GDI) approach is employed for this realization: modifying the CMOS inverter yields a GDI cell that serves as the basic building block. Both the traditional and the proposed transform designs are simulated and implemented in the same manner. These basic components are used to construct the Finite Impulse Response (FIR) filter that performs the decomposition required in DWT, so this study also emphasizes FIR filter design. A Computation Sharing Multiplier (CSHM) and a modified Carry Save Adder (CSA) are used for the CMOS-based FIR implementation. Because FIR filters require a high transistor count when implemented in CMOS technology, GDI is chosen for both the traditional and the proposed DWT topologies. Finally, the proposed DWT and DCT architectures are used to identify symptomatic patterns in audio biological signals and are compared with the standard system. This system is simulated in Modelsim Altera 10.4. The simulation shows that the proposed system requires less area and has lower latency.
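The role of the FIR filter in DWT decomposition can be illustrated with a one-level analysis filter bank. The sketch below assumes Haar filters, since the paper does not name its wavelet; the function names and the sample values are hypothetical.

```python
import math


def fir_filter(x, taps):
    """Direct-form FIR filter: y[n] = sum_k taps[k] * x[n - k], zero initial state."""
    return [
        sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
        for n in range(len(x))
    ]


def dwt_one_level(x, low_taps, high_taps):
    """Filter with both FIR filters, then keep every second output sample."""
    approx = fir_filter(x, low_taps)[1::2]     # low-pass branch, decimated by 2
    detail = fir_filter(x, high_taps)[1::2]    # high-pass branch, decimated by 2
    return approx, detail


# Haar analysis filters (an assumption; the paper does not name its wavelet).
R = 1 / math.sqrt(2)
HAAR_LOW = [R, R]
HAAR_HIGH = [R, -R]

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # hypothetical samples
approx, detail = dwt_one_level(signal, HAAR_LOW, HAAR_HIGH)

energy_in = sum(v * v for v in signal)
energy_out = sum(v * v for v in approx) + sum(v * v for v in detail)
print(approx, detail, energy_in, energy_out)
```

Each output sample costs one multiplication per filter tap, which is why the multiplier and adder designs dominate the area and delay of a hardware DWT.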

#### **Literature Review**

Adders and multipliers play crucial roles in the DCT. PTL has two main limitations. The first is the threshold voltage drop across the channel of the pass transistor, which reduces the drive current and slows operation at low supply voltages. This matters for low-power design, where operation at the lowest possible voltage is desirable. The second is that the high input voltage level at the regenerative inverters is not VDD, so the PMOS (P-type Metal-Oxide-Semiconductor) transistor in the inverter is not fully switched off and considerable static power dissipation can occur. The GDI (Gate-Diffusion Input) cell resembles a CMOS (Complementary Metal-Oxide-Semiconductor) inverter in that it uses one NMOS (N-type Metal-Oxide-Semiconductor) and one PMOS transistor, but it differs from the inverter in several key respects. With the GDI method, complex logic functions can be realized with only two transistors. The method is an alternative to CMOS and PTL for designing low-power, high-speed circuits with fewer transistors.
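How a single two-transistor GDI cell yields several logic functions can be shown with an idealized switch-level model. This sketch assumes ideal switches and ignores threshold drops and drive strength; the terminal names G, P, and N follow the usual description of the GDI cell, and the dictionary names are hypothetical.

```python
from itertools import product


def gdi_cell(g, p, n):
    """Idealized GDI cell: the PMOS passes P when G = 0, the NMOS passes N when G = 1."""
    return n if g else p


# Terminal assignments gdi_cell(G, P, N) for a few functions of inputs A and B.
GDI_FUNCTIONS = {
    "NOT A": lambda a, b: gdi_cell(a, 1, 0),
    "A AND B": lambda a, b: gdi_cell(a, 0, b),
    "A OR B": lambda a, b: gdi_cell(a, b, 1),
    "(NOT A) AND B": lambda a, b: gdi_cell(a, b, 0),
    "(NOT A) OR B": lambda a, b: gdi_cell(a, 1, b),
}

# Reference Boolean definitions used to check the cell model.
REFERENCE = {
    "NOT A": lambda a, b: 1 - a,
    "A AND B": lambda a, b: a & b,
    "A OR B": lambda a, b: a | b,
    "(NOT A) AND B": lambda a, b: (1 - a) & b,
    "(NOT A) OR B": lambda a, b: (1 - a) | b,
}

mismatches = [
    (name, a, b)
    for name in GDI_FUNCTIONS
    for a, b in product((0, 1), repeat=2)
    if GDI_FUNCTIONS[name](a, b) != REFERENCE[name](a, b)
]
print("mismatches:", mismatches)
```

Changing only what is wired to the P and N terminals changes the function, whereas a static CMOS AND or OR gate needs six transistors.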

**Bahubalindruni et al. (2017)** introduced a new low-power analog circuit that can act as an adder. Under worst-case conditions (when adding four signals of the same frequency), the proposed circuit consumes 78 µW of power while exhibiting a linearity error of less than 3.2% (up to an input signal peak-to-peak value of 2 V). Its linearity performance is better than that of the common drivers studied in that work.

**Ileskhan Kalysh et al. (2018)** presented an analog multiplier with four variables, implemented as a CMOS-memristor circuit. Many analog multipliers built from resistors and CMOS transistors already exist. The circuit is modeled in SPICE for simulation, and the proposed circuit consumes less power and processes data in less time.

Two truncated multiplier designs exist for n-bit × n-bit multiplication: the constant correction method and the variable correction method. For an n × n multiplication, a fixed-width multiplier computes only the n most significant bits of the 2n-bit result. Supplementary compensation and correction circuits are used to reduce the truncation error. Either of these truncated multiplier designs may be used, with the choice depending on the error introduced by the discarded least significant partial product (PP) bits. Some Digital Signal Processing applications, such as FIR filter construction, require truncated multipliers with faithful rounding, i.e. a maximum absolute error after truncation of less than 1 unit in the last place (ulp).
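The constant correction idea can be illustrated with a small behavioral model. The sketch below is not the multiplier of the paper: the operand width, the number of guard columns, and the correction constant are all illustrative assumptions, chosen so that the faithful-rounding bound can be checked exhaustively.

```python
N_BITS = 6          # operand width n (kept small so the check below is exhaustive)
GUARD_COLS = 3      # extra partial-product columns kept below the output bits
CUT = N_BITS - GUARD_COLS            # columns 0 .. CUT-1 are never formed
# Rounding constant (half an ulp) plus a fixed correction for the dropped columns.
# The correction value 8 is an illustrative choice, not taken from the paper.
CORRECTION = (1 << (N_BITS - 1)) + 8


def truncated_multiply(a, b):
    """Fixed-width product: n output bits from an n x n unsigned multiplication."""
    kept = 0
    for i in range(N_BITS):
        for j in range(N_BITS):
            if i + j >= CUT and (a >> i) & 1 and (b >> j) & 1:
                kept += 1 << (i + j)          # partial-product bit that is kept
    return (kept + CORRECTION) >> N_BITS      # drop the low n bits


def worst_case_error_ulp():
    """Largest |result - exact| over all operand pairs, in units of the output LSB."""
    worst = 0.0
    for a in range(1 << N_BITS):
        for b in range(1 << N_BITS):
            exact = (a * b) / (1 << N_BITS)
            worst = max(worst, abs(truncated_multiply(a, b) - exact))
    return worst


max_error = worst_case_error_ulp()
print("worst-case error in ulp:", max_error)
```

With these parameters the dropped columns can contribute at most 17/64 of an ulp, so the total error stays below 1 ulp, which is the faithful-rounding condition described above.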

**Moss et al. (2018)** proposed a serial-parallel radix-4 multiplier whose improved performance may benefit applications such as digital filters, artificial neural networks, and other machine learning techniques. The multiplier is a variant of the serial-parallel (SP) modified radix-4 Booth multiplier that adds only the nonzero Booth encodings and skips the zero operations to improve latency.
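Skipping zero Booth encodings can be illustrated in software. The following is a behavioral sketch of standard radix-4 Booth recoding, not the two-speed hardware of Moss et al.; the function names and the default 8-bit width are assumptions made for illustration.

```python
def booth_radix4_digits(y, n_bits):
    """Recode a signed n-bit multiplier (n even) into radix-4 digits in {-2, ..., 2}."""
    digits = []
    prev = 0                                   # implicit bit y[-1] = 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits                              # least significant digit first


def booth_multiply(x, y, n_bits=8):
    """Accumulate x * digit * 4**i, skipping every zero digit."""
    acc = 0
    steps = 0
    for i, d in enumerate(booth_radix4_digits(y, n_bits)):
        if d == 0:
            continue                           # zero encoding: no addition needed
        acc += (x * d) << (2 * i)
        steps += 1
    return acc, steps


product_value, additions = booth_multiply(5, 127)
print(product_value, additions)
```

For the multiplier value 127 only two of the four radix-4 digits are nonzero, so only two additions are performed; a hardware design that skips zero digits gains latency in the same way.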

## Design of DCT Architecture Using RCA BEC Adder and Truncation Multiplier Circuits

The HDL design is shown graphically in Figure 1, which details the multipliers and adders and the connections between the components. Figure 1 also displays the results of a Modelsim simulation of the VHDL code for the DCT design.

Figure 1 Simulation of DCT using RCA Adders and Multipliers

The results of the DCT simulation are shown in Figure 1. The first two signals are the clock and reset. The next two signals are a sine wave and a cosine wave; the sine wave is carried as the real component and the cosine wave as the imaginary component. The following two signals show the DCT output in the time domain, and the remaining signals show the results of successive sub-modules. Data-re and Data-in together form the input to the DCT: Data-re is a sinusoidal signal, while Data-in is in cosine form. The DCT design produces two output signals, data-out-re and data-out-in.

The VHDL synthesis report allows the performance of the standard DCT to be compared with that of the proposed DCT. The results are compared in Table 2.

| XILINX FPGA<br>S6LX9-2FG144 | XOR MUX Truncation<br>DCT | RCA-BEC<br>Truncation based<br>DCT |
|-----------------------------|---------------------------|------------------------------------|
| Slice Registers             | 294                       | 277                                |
| LUT                         | 2162                      | 3551                               |
| Occupied Slice              | 711                       | 1245                               |
| IOB                         | 34                        | 34                                 |
| Delay(ns)                   | 28.118                    | 94.367                             |
| Power(mW)                   | 14                        | 14                                 |

#### Table 2. Comparison of DCT

Table 2 shows that, compared with the traditional design, the proposed architecture uses about 39% fewer LUTs and 43% fewer occupied slices, and its delay is 66.25 ns lower.
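The savings can be recomputed directly from the table entries. This sketch assumes that the XOR MUX column is the proposed design and the RCA-BEC column is the traditional one, which is inferred from the surrounding text rather than stated in the table; the dictionary and function names are hypothetical.

```python
# Synthesis figures copied from the DCT comparison table.
xor_mux = {"slice_registers": 294, "lut": 2162, "occupied_slices": 711,
           "delay_ns": 28.118, "power_mw": 14}
rca_bec = {"slice_registers": 277, "lut": 3551, "occupied_slices": 1245,
           "delay_ns": 94.367, "power_mw": 14}


def relative_reduction(baseline, candidate):
    """Fractional saving of candidate relative to baseline (positive = smaller)."""
    return (baseline - candidate) / baseline


lut_saving = relative_reduction(rca_bec["lut"], xor_mux["lut"])
slice_saving = relative_reduction(rca_bec["occupied_slices"],
                                  xor_mux["occupied_slices"])
delay_saving_ns = rca_bec["delay_ns"] - xor_mux["delay_ns"]

print(f"LUT saving:   {lut_saving:.1%}")
print(f"Slice saving: {slice_saving:.1%}")
print(f"Delay saving: {delay_saving_ns:.3f} ns")
```

The same helper can be applied to the other comparison tables by swapping in their entries.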

#### DWT

The synthesis report generated for the VHDL DWT design yields the following results for both the traditional and the proposed DWT.

| XILINX FPGA<br>S6LX9-2FG144 | XOR MUX<br>Truncation DWT | RCA-BEC Truncation<br>based DWT |
|-----------------------------|---------------------------|---------------------------------|
| Slice Registers             | 696                       | 1001                            |
| LUT                         | 1014                      | 1769                            |
| Occupied Slice              | 325                       | 592                             |
| IOB                         | 58                        | 58                              |
| Delay(ns)                   | 8.981                     | 28.453                          |
| Power(mW)                   | 14                        | 14                              |

#### Table 3. Comparison of DWT


Table 3 shows that the proposed design uses about 43% fewer LUTs and 45% fewer occupied slices than the baseline design, and its delay is shorter by 19.47 ns. However, the proposed and standard designs have the same power requirements.

#### Audio Biological System

The synthesis report for the audio biological testing environment, built from the DWT and DCT designs, is used to compile the following comparison table.

| XILINX FPGA<br>S6LX16-2CSG324 | XOR MUX<br>Truncation | RCA-BEC Truncation |
|-------------------------------|-----------------------|--------------------|
| Slice Registers               | 2829                  | 2884               |
| LUT                           | 4582                  | 6911               |
| Occupied Slice                | 1613                  | 2101               |
| IOB                           | 15                    | 15                 |
| Delay(ns)                     | 30.125                | 95.513             |
| Power(mW)                     | 21                    | 21                 |

#### Table 4. Comparison of Audio Biological System

According to Table 4, the proposed architecture uses about 34% fewer LUTs and 23% fewer occupied slices than the standard approach, and its delay is lower by 65.39 ns. However, the conventional and proposed designs have identical power requirements.

## **BACK END RESULTS**

Back-end processing takes the design from gate-level analysis to transistor-level modeling. On the FPGA, the place-and-route tool determines how the pre-existing flip-flops and gates on the device are connected.

**Table 5 Analog implementation** 

| Sl.No | Type             | Delay (ps) | Power (µW) | Power Delay<br>Product (fJ) |
|-------|------------------|------------|------------|-----------------------------|
| 1     | Adder/Subtractor | 2915       | 7.231      | 21                          |
| 2     | Multiplier       | 17365      | 135.721    | 2356                        |
| 3     | 4 Point DCT      | 137079     | 968.125    | 132709                      |
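The power delay product column can be reproduced from the delay and power columns. Multiplying picoseconds by microwatts gives units of 10^-18 J, so the tabulated magnitudes correspond to femtojoules. The sketch below only rechecks the arithmetic; the names are hypothetical.

```python
PS = 1e-12   # seconds per picosecond
UW = 1e-6    # watts per microwatt
FJ = 1e-15   # joules per femtojoule

# (delay in ps, power in uW, tabulated power delay product) from Table 5.
rows = {
    "Adder/Subtractor": (2915, 7.231, 21),
    "Multiplier": (17365, 135.721, 2356),
    "4 Point DCT": (137079, 968.125, 132709),
}


def power_delay_product_fj(delay_ps, power_uw):
    """Energy per operation in femtojoules."""
    return (delay_ps * PS) * (power_uw * UW) / FJ


pdp = {name: power_delay_product_fj(d, p) for name, (d, p, _) in rows.items()}
for name, value in pdp.items():
    print(f"{name}: {value:.1f} fJ (table: {rows[name][2]})")
```

Each computed value matches the tabulated entry once the fractional part is dropped.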

Table 5 displays the results of the analog implementation. For the digital implementation, the adder uses 14.330 µW, whereas the multiplier uses 546.811 µW. The digital adder implementation uses around 8.9 percent, and the DCT implementation about 84 percent, as much power. The digital DCT as a whole requires 3845.9 µW of power.



Figure 2 Comparison of delay for analog and digital implementation of DCT

Figures 2, 3 and 4 present the comparison between analog and digital implementation.



Figure 3 Power comparison of analog and digital implementation of DCT




Figure 4 Comparison of power delay product for analog and digital implementation of DCT

Compared with the digital version, the analog adder/subtractor and the analog 4-point DCT are 1.4% and 2.2% faster, respectively. The analog adder/subtractor consumes roughly half the power of its digital counterpart, while the analog 4-point DCT consumes about a quarter. These results show that the analog 4-point DCT implementation is superior to its digital counterpart.

#### Conclusion

In this study, we present a DCT- and DWT-based audio watermarking technique. Our experiments validated the resilience of the proposed watermarking system against standard signal processing procedures. In addition, the proposed system achieves a high SNR with low error probability rates. We have compared our method with state-of-the-art competitors in the field of audio watermarking. When tested against common attacks including noise insertion, resampling, requantization, cropping, low-pass filtering, and MP3 compression, the proposed approach performed very well. The proposed approach nevertheless remains vulnerable to some attacks, including resampling, low-pass filtering, amplitude scaling, pitch shifting, and MP3 compression. We plan to address this issue in the near future.


#### References

- Katreepalli, R & Haniotakis, T (2017), 'High Speed Power Efficient Carry Select Adder Design', proceedings of the IEEE Computer Society Annual Symposium on VLSI (IS VLSI-2017), pp. 32-37.
- Moss, DJ, Boland, D & Leong, PH (2018), 'A Two-Speed, Radix-4, Serial-Parallel Multiplier', IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 27, no. 4, pp. 769-77.
- Liu, W, Qian, L, Wang, C, Jiang, H, Han, J & Lombardi, F (2017), 'Design of approximate radix-4 Booth multipliers for error-tolerant computing', IEEE Transactions on Computers, vol. 66, no. 8, pp. 1435-1441.
- Cui, X, Dong, W, Liu, W, Swartzlander, EE & Lombardi, F (2017), 'High Performance Parallel Decimal Multipliers Using Hybrid BCD Codes', IEEE Transactions on Computers, vol. 66, no. 12, pp. 1994-2004.
- Bahubalindruni, PG, Tavares, VG, Martins, R, Fortunato, E & Barquinha, P (2017), 'A low-power analog adder and driver using a-IGZO TFTs', IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 64, no. 5, pp. 1118-1128.
- Kalysh, I, Krestinskaya, O & James, AP (2018), 'CMOS-Memristive Analog Multiplier Design', proceedings of the 2018 International Conference on Computing and Network Communications (CoCoNet), pp. 1-5.
- Xing, Y, Zhang, Z, Qian, Y, Li, Q & He, Y (2018), 'An Energy-Efficient Approximate DCT for Wireless Capsule Endoscopy Application', proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-4.
- Mopuri, S, Vanjari, SR & Acharyya, A (2018), 'Low-Complexity and Reconfigurable Discrete Hilbert Transform Architecture Design Methodology', Journal of Low Power Electronics, vol. 14, no. 2, pp. 327- 336.
- Sabeetha, S, Ajayan, J, Shriram, S, Vivek, K & Rajesh, V (2015), 'A study of performance comparison of digital multipliers using 22nm strained silicon technology', proceedings of the 2nd International Conference on Electronics and Communication Systems (ICECS), pp. 180-184.
- Thiruveni, M & Deivakani, M (2012), 'Design of Analog VLSI Architecture for DCT', International Journal of Engineering and Technology, vol. 2, no. 8, pp. 1475-1481.

- Fox, TW & Turner, LE(2002), 'Implementing the discrete cosine transform using the Xilinx Virtex FPGA', proceeding of the International Conference on Field Programmable Logic and Applications, ed. Glesner M., Zipf P., Renovell M. Springer, Berlin, Heidelberg, pp. 492-502.
- Ang, BH, Sheikh, UU & Marsono, MN (2015), '2-D DWT system architecture for image compression'. Journal of Signal Processing Systems, vol. 78, no. 2, pp. 131-137.
- Darji, A, Agrawal, S, Oza, A, Sinha, V, Verma, A, Merchant, SN & Chandorkar, AN (2014), 'Dual-scan parallel flipping architecture for a lifting-based 2-D discrete wavelet transform', IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, no.6, pp. 433-437.
- Mohanty, BK & Meher, PK (2016), 'A high-performance FIR filter architecture for fixed and reconfigurable applications', IEEE transactions on very large scale integration (VLSI) systems, vol. 24, no. 2, pp. 444- 452.
- Morgenshtein, A, Fish, A & Wagner, IA (2002), 'Gate-diffusion input (GDI): a power-efficient method for digital combinatorial circuits', IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 10, no. 5, pp. 566-581.
- Penchalaiah, U & Siva Kumar, VG (2018), 'Survey: Performance Analysis of FIR Filter Design Using Modified Truncation with SQRT based carry Select Adder', International Journal of Engineering and Technology, vol. 7, no. 2, pp. 23-34.