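| Abstract: |
| Reconfigurable structures, owing to their high flexibility and high parallelism, have become one of the best choices for compute-intensive applications such as long short-term memory (LSTM) networks. However, as parameter counts and computation grow, so do the demands on storage and bandwidth, which severely limits gains in computational efficiency. To address this problem, an LSTM hybrid compression optimization method for reconfigurable structures is proposed. Based on the sensitivity of the LSTM network to errors during training, the network is compressed with different compression algorithms and retrained after compression; by analyzing how model accuracy recovers and how long convergence takes, the gating units in the network are divided into an error-sensitive group and an error-insensitive group. The Top-k pruning strategy and the block-circulant matrix transformation strategy are then used to compress the gating units in the error-sensitive group and the error-insensitive group, respectively. Finally, the LSTM network is implemented on a reconfigurable array processor built on a Virtex UltraScale VU440 FPGA (field programmable gate array) development board. The results show that the LSTM network achieves a compression ratio of 38.4 and a hardware speedup of 1.41, with an accuracy loss of about 1.7% and some reduction in hardware resource consumption. |
| Keywords: long short-term memory network; reconfigurable structure; model compression |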
| DOI:10.20079/j.issn.1001-893x.240926004 |
|
| Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0119005)
|
| LSTM Hybrid Compression Optimization Method for Reconfigurable Structures
| WU Hai,JIANG Lin,b,LI Yuancheng |
| (a. College of Communication and Information Technology; b. College of Computer Science & Technology, Xi'an University of Science and Technology, Xi'an 710600, China)
| Abstract: |
| The flexibility and parallelism of reconfigurable structures have made them one of the best choices for compute-intensive applications such as long short-term memory (LSTM) networks. However, as the number of parameters and the amount of computation increase, the demand for storage and bandwidth rises, which severely limits computational efficiency. To tackle this problem, an LSTM hybrid compression optimization method for reconfigurable structures is proposed. Based on the sensitivity of the LSTM network to errors during training, the network is compressed using different compression algorithms and retrained after compression; the recovery of model accuracy and the convergence time are analyzed, and the gating units in the network are classified into error-sensitive and error-insensitive groups. The gating units in the error-sensitive and error-insensitive groups are compressed using the Top-k pruning strategy and the block-circulant matrix transformation strategy, respectively. Finally, the LSTM network is implemented on a reconfigurable array processor built on a Virtex UltraScale VU440 field programmable gate array (FPGA) development board. The results show that the LSTM network achieves a compression ratio of 38.4 and a hardware speedup of 1.41, with an accuracy loss of about 1.7% and a reduction in hardware resource consumption.
| Key words: LSTM; reconfigurable structure; model compression
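
The two compression strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the pruning granularity, the block size, or how the products are computed in hardware. The sketch assumes row-wise Top-k pruning by magnitude and a direct (non-FFT) circulant product, and the names `topk_prune_rows`, `circulant_matvec`, and `block_circulant_matvec` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def topk_prune_rows(w: Matrix, k: int) -> Matrix:
    """Keep the k largest-magnitude entries of each row and zero the rest.

    Row-wise granularity is an assumption made for this illustration.
    """
    pruned = []
    for row in w:
        # Indices of the k entries with the largest absolute value.
        order = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
        keep = set(order[:k])
        pruned.append([v if j in keep else 0.0 for j, v in enumerate(row)])
    return pruned


def circulant_matvec(c: List[float], x: List[float]) -> List[float]:
    """Multiply the b-by-b circulant matrix whose first column is c by x.

    Entry (i, j) of the matrix is c[(i - j) % b], so only b values are
    stored instead of b * b.
    """
    b = len(c)
    return [sum(c[(i - j) % b] * x[j] for j in range(b)) for i in range(b)]


def block_circulant_matvec(
    blocks: List[List[List[float]]], x: List[float], b: int
) -> List[float]:
    """Multiply a block-circulant matrix by x.

    blocks[p][q] is the defining vector (first column) of the b-by-b
    circulant block in block-row p and block-column q.
    """
    y: List[float] = []
    for block_row in blocks:
        acc = [0.0] * b
        for q, c in enumerate(block_row):
            # Each block only sees its own length-b slice of the input.
            part = circulant_matvec(c, x[q * b:(q + 1) * b])
            acc = [a + t for a, t in zip(acc, part)]
        y.extend(acc)
    return y
```

With block size b, each b-by-b block is represented by b values, so the stored parameters of a block-circulant weight matrix shrink by a factor of b. Top-k pruning instead keeps the original values of the retained weights, which is one plausible reason to reserve it for gates that tolerate less error.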