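Abstract: |
Several typical similarity algorithms show low accuracy and poor adaptability when computing the similarity of incomplete time series. To address this, a similarity algorithm suited to incomplete time series is proposed. It combines difference transformation, quantization, symbolization, and equivalent character transformation, and draws on the strengths of the Longest Common Subsequence (LCS) and Greedy String Tiling (GST) algorithms. For time series with equal-length pulse loss, the similarity results of the algorithm, after weighted averaging, are more than 10% more accurate than those of the typical algorithms; for the more general case of unequal-length pulse loss, the accuracy of the similarity results also improves markedly. Experimental results show that the algorithm associates data from incomplete time series well and is robust on real-world data. |
Keywords: time series; similarity algorithm; pulse missing; equivalent character; data association |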
DOI: |
|
Funding: |
|
A similarity algorithm for incomplete time series based on equivalent characters |
HUANG Chen, ZHANG Wei, LYU Min
(Science and Technology on Electronic Information Control Laboratory, Chengdu 610036, China; Southwest China Institute of Electronic Technology, Chengdu 610036, China)
Abstract: |
Several typical similarity algorithms show low accuracy and poor adaptability when applied to incomplete time series. To address this, a similarity algorithm suited to incomplete time series is proposed. It combines difference transformation, quantization, symbolization, and equivalent character transformation, and draws on the strengths of the Longest Common Subsequence (LCS) and Greedy String Tiling (GST) algorithms. For time series with equal-length pulse loss, the similarity results, after weighted averaging, are more than 10% more accurate than those of the typical algorithms. For the more general case of unequal-length pulse loss, the accuracy of the similarity results also improves significantly. The experimental results show that the algorithm associates data from incomplete time series well and is robust in real-world environments. |
Key words: time series; similarity algorithm; pulse missing; equivalent character; data association |
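
The preprocessing steps named in the abstract (difference transformation, quantization, symbolization) followed by an LCS comparison can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `symbolize`, `lcs_length`, and `similarity`, the quantization `step`, the mapping of levels onto the letters `a` to `z`, and the normalization by the longer string are all assumptions made for illustration. The equivalent character transformation, the GST component, and the weighted averaging are not specified in the abstract and are deliberately left out.

```python
from typing import Sequence


def symbolize(series: Sequence[float], step: float = 1.0) -> str:
    """Difference the series, quantize each difference, map it to a letter.

    Assumed scheme (not from the paper): level 0 maps to 'm', and levels
    are clamped so that every symbol stays within 'a'..'z'.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    levels = [round(d / step) for d in diffs]
    low, mid, high = ord("a"), ord("m"), ord("z")
    return "".join(chr(min(max(mid + lv, low), high)) for lv in levels)


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(x: Sequence[float], y: Sequence[float], step: float = 1.0) -> float:
    """LCS length of the two symbol strings, normalized by the longer string."""
    sx, sy = symbolize(x, step), symbolize(y, step)
    if not sx or not sy:
        return 0.0
    return lcs_length(sx, sy) / max(len(sx), len(sy))


# Hypothetical pulse arrival times; `missing` drops the pulse at t = 5.
full = [0, 2, 5, 7, 10, 12]
missing = [0, 2, 7, 10, 12]
print(symbolize(full), symbolize(missing), similarity(full, missing))
```

In this toy input, dropping one pulse merges two adjacent intervals (2 and 3) into a single interval of 5, which becomes a symbol that matches nothing in the complete series. Plain LCS therefore loses matches around every gap, which is the kind of degradation on incomplete series that the abstract attributes to typical algorithms.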