Diagnostic Study of Acoustic Features in Parkinson’s Disease Based on Lasso-Logistic Regression
DOI: 10.12677/sa.2024.132025

Abstract: Using acoustic feature data from 80 European subjects recruited through the Regional Association of Parkinson’s Disease in Extremadura (Spain), this paper proposes a two-stage variable selection method, combined with lasso regression, to screen 44 acoustic features. Six significant factors were retained: gender, Shim_loc, MFCC3, HNR35, PPE, and GNE. These factors were combined by multivariable logistic regression into a nomogram model of the risk of developing PD, and the validity and calibration of the model were assessed from multiple perspectives. The results showed that early PD patients, whose basal ganglia motor regulation is abnormal, had low MFCC3 and HNR35 values and high PPE, GNE, and Shim_loc values in the acoustic data, together with a reduced maximal frequency of vocal fold vibration during articulation and muffled voices. This indicates that the constructed nomogram can assess a subject's risk of PD from these acoustic features. In the future, acoustic features are expected to become important biomarkers for early PD diagnosis and to support remote screening of the disease.

1. Introduction

2. Methods

2.1. General Data

Table 1. Key indicators of the acoustic characterisation dataset

2.2. Two-Stage Variable Selection Method

$d\left({x}_{k},{x}_{l}\right)=1-\left|\rho \left({x}_{k},{x}_{l}\right)\right|$, where $\rho \left({x}_{k},{x}_{l}\right)=\frac{{\sum }_{i=1}^{n}\left({x}_{ik}-{\bar{x}}_{k}\right)\left({x}_{il}-{\bar{x}}_{l}\right)}{\sqrt{{\sum }_{i=1}^{n}{\left({x}_{ik}-{\bar{x}}_{k}\right)}^{2}}\sqrt{{\sum }_{i=1}^{n}{\left({x}_{il}-{\bar{x}}_{l}\right)}^{2}}}$ (1)

$\delta \left({x}_{k}\right)={\sum }_{l\in {c}_{g}}d\left({x}_{k},{x}_{l}\right)$ (2)
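Equations (1) and (2) can be evaluated directly. The following is a minimal sketch in plain Python, assuming each feature is a list of measurements across subjects; the feature names `f1`, `f2`, `f3`, their values, and the helper names are hypothetical and only illustrate the arithmetic, not the study's data or its grouping.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation rho(u, v), as in Equation (1).
    Assumes neither feature is constant (otherwise the denominator is zero)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = (math.sqrt(sum((a - mean_u) ** 2 for a in u))
           * math.sqrt(sum((b - mean_v) ** 2 for b in v)))
    return num / den

def dissimilarity(u, v):
    """d(u, v) = 1 - |rho(u, v)|, Equation (1)."""
    return 1.0 - abs(pearson(u, v))

def delta(k, group, features):
    """delta(x_k): sum of d(x_k, x_l) over all features l in the group, Equation (2)."""
    return sum(dissimilarity(features[k], features[l]) for l in group)

# Hypothetical toy features for one group (illustrative values only).
features = {
    "f1": [1, 2, 3, 4, 5],
    "f2": [2, 4, 6, 8, 10],   # perfectly correlated with f1, so d(f1, f2) is 0
    "f3": [5, 3, 4, 1, 2],    # correlation with f1 is -0.8, so d(f1, f3) is 0.2
}
group = ["f1", "f2", "f3"]
deltas = {k: delta(k, group, features) for k in group}
print(deltas)
```

Because $d$ uses the absolute value of the correlation, strongly negatively correlated features count as similar in the same way as strongly positively correlated ones.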

${\mathrm{min}}_{\omega }\left({‖y-X\omega ‖}_{2}^{2}+\lambda {‖\omega ‖}_{1}\right),\quad \lambda >0$ (3)

$\omega =\mathrm{arg}{\mathrm{min}}_{\omega }\left({‖y-X\omega ‖}_{2}^{2}+\lambda {‖\omega ‖}_{1}\right)={\left({X}^{\text{T}}X\right)}^{-1}\left({X}^{\text{T}}y-0.5\lambda I\right)$ (4)
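One way to read Equation (4) is through the first-order (subgradient) optimality condition of the objective in (3). The sketch below assumes that $X^{\text{T}}X$ is invertible and that the vector written as $I$ in (4) stands for a sign vector $s$ of the solution; both are assumptions made here for illustration, not statements taken from the paper.

```latex
\begin{aligned}
0 &= -2X^{\mathrm{T}}\left(y - X\omega\right) + \lambda s
  \quad \text{for some } s \text{ with }
  s_j = \operatorname{sign}(\omega_j) \text{ if } \omega_j \neq 0,\;
  s_j \in [-1, 1] \text{ if } \omega_j = 0 \\
\Rightarrow\quad X^{\mathrm{T}}X\,\omega &= X^{\mathrm{T}}y - \tfrac{1}{2}\lambda s \\
\Rightarrow\quad \omega &= \left(X^{\mathrm{T}}X\right)^{-1}\left(X^{\mathrm{T}}y - \tfrac{1}{2}\lambda s\right)
\end{aligned}
```

Because $s$ depends on $\omega$ itself, this expression characterizes the solution implicitly rather than giving a closed form, which is why lasso problems are usually solved iteratively, for example by coordinate descent.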

$\mathrm{CV}\left(\lambda \right)=\frac{1}{n}{\sum }_{j=1}^{n}{\mathrm{MSE}}_{j}$ (5)
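The interplay of Equations (3) and (5) can be sketched in plain Python: fit the lasso objective for each candidate $\lambda$, average the held-out mean squared errors across folds, and keep the $\lambda$ with the smallest average. This is an illustrative sketch, not the authors' implementation. It assumes a coordinate-descent solver, no intercept, five interleaved folds, a hand-picked $\lambda$ grid, and synthetic data; all function and variable names are hypothetical.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for min_w ||y - Xw||_2^2 + lam * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, with w = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n))
            new_w = soft_threshold(rho, lam / 2.0) / col_sq[j]
            step = new_w - w[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * step
                w[j] = new_w
    return w

def mse(X, y, w):
    """Mean squared prediction error of coefficients w on (X, y)."""
    preds = [sum(xj * wj for xj, wj in zip(row, w)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def cv_error(X, y, lam, k=5):
    """Equation (5): average of the k held-out fold MSEs for a given lambda."""
    n = len(X)
    fold_mses = []
    for f in range(k):
        test_idx = set(range(f, n, k))  # interleaved fold assignment (assumption)
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        fold_mses.append(mse(X_te, y_te, lasso_cd(X_tr, y_tr, lam)))
    return sum(fold_mses) / k

# Synthetic data (NOT the study's data): only the first two of five features matter.
random.seed(0)
n, p = 60, 5
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0.0, 0.1) for row in X]

lam_grid = [0.1, 1.0, 10.0, 50.0, 200.0]  # hypothetical candidate values
cv_scores = {lam: cv_error(X, y, lam) for lam in lam_grid}
best_lam = min(cv_scores, key=cv_scores.get)
w_best = lasso_cd(X, y, best_lam)
print("CV error per lambda:", cv_scores)
print("selected lambda:", best_lam, "coefficients:", w_best)
```

In practice, a tested library routine with cross-validation built in would normally replace this hand-written solver. The sketch only makes the roles of $\lambda$ and the fold-wise $\mathrm{MSE}_{j}$ explicit.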

2.3. Building the Multivariable Logistic Regression Nomogram Model

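Logistic regression is a form of nonlinear regression: a multiple regression method for studying the relationship between a binary or multi-category dependent variable and a set of influencing factors [5]. Etiological studies of disease often need to quantify the relationship between disease occurrence and individual risk factors. The logistic regression model handles the case in which a binary dependent variable cannot satisfy the basic assumptions of linear regression, and it is one of the most commonly used methods in medical research, particularly in epidemiological studies of etiology. In this study, the dependent variable y is binary (1 for a PD patient, 0 for a healthy subject), and the acoustic features ${x}_{1},{x}_{2},\cdots$ can serve as biomarkers that reflect whether a subject has the disease and thus influence the value of y. Suppose a subject has m acoustic features as independent variables, so that the conditional probability of being a PD patient given these m variables is $p=p\left(y=1|{x}_{1},{x}_{2},\cdots ,{x}_{m}\right)$. The logistic regression model can then be written as Equation (6), where ${\beta }_{0}$ is the constant term and ${\beta }_{1},\cdots ,{\beta }_{m}$ are the partial regression coefficients.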

$p=\frac{\mathrm{exp}\left({\beta }_{0}+{\beta }_{1}{x}_{1}+\cdots +{\beta }_{m}{x}_{m}\right)}{1+\mathrm{exp}\left({\beta }_{0}+{\beta }_{1}{x}_{1}+\cdots +{\beta }_{m}{x}_{m}\right)}$ (6)

$\mathrm{ln}\frac{p}{1-p}={\beta }_{0}+{\beta }_{1}{x}_{1}+\cdots +{\beta }_{m}{x}_{m}$ (7)
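Equations (6) and (7) are two views of the same model: (6) maps the linear predictor to a probability, and (7) recovers the linear predictor as the log-odds. The short sketch below checks this numerically. The coefficients and feature values are hypothetical placeholders, not the fitted model from this study.

```python
import math

def pd_probability(beta0, betas, x):
    """Equation (6): p = exp(eta) / (1 + exp(eta)), with eta = beta0 + sum(beta_j * x_j).
    Written in a numerically stable form for large |eta|."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x))
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def logit(p):
    """Equation (7): ln(p / (1 - p)), the log-odds."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and feature values (illustrative only).
beta0 = -0.5
betas = [0.8, -1.2, 0.3]
x = [1.0, 0.5, 2.0]

eta = beta0 + sum(b * xi for b, xi in zip(betas, x))  # linear predictor
p = pd_probability(beta0, betas, x)
print("linear predictor:", eta)
print("predicted probability:", p)
print("log-odds recovered from p:", logit(p))
```

A positive coefficient raises the predicted probability as its feature increases, and a negative one lowers it. Each unit change in $x_j$ shifts the log-odds by $\beta_j$.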

3. Empirical Analysis

3.1. Acoustic Feature Screening Results

Table 2. Comparison of clinical data between the diseased and healthy groups

Figure 1. Path diagram of Lasso regression coefficients (left) with cross-validation (right)

3.2. PD Diagnosis with the Logistic Regression Nomogram

Figure 2. Diagnostic nomogram for PD

Figure 3. Calibration curve

Figure 4. ROC curve (left) and Decision curve (right)

4. Discussion

1) Dissimilarity $d\left({x}_{k},{x}_{l}\right)$ of the feature sets in each group

2) Heat maps of the feature sets in each group

References

[1] Naranjo, L., Pérez, C.J., Campos-Roca, Y. and Martín, J. (2016) Addressing Voice Recording Replications for Parkinson’s Disease Detection. Expert Systems with Applications, 46, 286-292. https://doi.org/10.1016/j.eswa.2015.10.034

[2] Naranjo, L., Pérez, C.J., Martín, J. and Campos-Roca, Y. (2017) A Two-Stage Variable Selection and Classification Approach for Parkinson’s Disease Detection by Using Voice Recording Replications. Computer Methods and Programs in Biomedicine, 142, 147-156. https://doi.org/10.1016/j.cmpb.2017.02.019

[3] 马贵斌, 贺真伟, 王子德, 文洋, 李祥. 基于LASSO回归的胶质瘤早期鉴别诊断模型的构建和验证[J]. 重庆医学, 2023, 52(21): 3287-3291.

[4] 陆春光, 葛梦亮, 宋磊, 吴继亮, 潘国兵. 基于Lasso-XGBoost-Stacking的省域电能替代潜力预测方法[J]. 浙江电力, 2023, 42(9): 9-15.

[5] 何晓群, 刘文卿. 应用回归分析[M]. 第5版. 北京: 中国人民大学出版社, 2019: 240-243.

[6] 黄茜, 郑少燕, 张志英, 朱丹萍, 范雪婷, 杜彪, 刘松坚. 基于Lasso回归构建生物标志物影响代谢综合征的风险预测模型[J]. 中国疗养医学, 2024, 33(1): 1-5.

[7] 文鹏程, 张峪涵, 文贵华. 帕金森病患者的语音声学特征分析[J]. 听力学及言语疾病杂志, 2023(31): 1-4.

[8] Rusz, J., Hlavnička, J., Tykalová, T., Bušková, J., Ulmanová, O., Ružička, E. and Šonka, K. (2016) Quantitative Assessment of Motor Speech Abnormalities in Idiopathic Rapid Eye Movement Sleep Behaviour Disorder. Sleep Medicine, 19, 141-147. https://doi.org/10.1016/j.sleep.2015.07.030