PMID- 38442048
OWN - NLM
STAT- Publisher
LR  - 20240313
IS  - 2168-2208 (Electronic)
IS  - 2168-2194 (Linking)
VI  - PP
DP  - 2024 Mar 8
TI  - LA-ViT: A Network with Transformers Constrained by Learned-Parameter-Free 
      Attention for Interpretable Grading in a New Laryngeal Histopathology Image 
      Dataset.
LID - 10.1109/JBHI.2024.3373438 [doi]
AB  - Grading laryngeal squamous cell carcinoma (LSCC) based on histopathological 
      images is a clinically significant yet challenging task. However, a large amount 
      of low-effect background semantic information appears in the feature maps, 
      feature channels, and class activation maps, which seriously degrades the accuracy 
      and interpretability of LSCC grading. Because the traditional transformer block 
      makes extensive use of parameter attention, the model overlearns the low-effect 
      background semantic information and fails to reduce the proportion of background 
      semantics effectively. Therefore, we propose an end-to-end network 
      with transformers constrained by learned-parameter-free attention (LA-ViT), which 
      improves the ability to learn high-effect target semantic information and reduces 
      the proportion of background semantics. Firstly, from a generalized linear model 
      and probabilistic perspective, we demonstrate that learned-parameter-free attention 
      (LA) has a stronger ability to learn highly effective target semantic information 
      than parameter attention. Secondly, the first-type LA transformer block of LA-ViT 
      uses the feature map position subspace to realize the query, uses the feature 
      channel subspace to realize the key, and adopts average convergence to obtain the 
      value. Together, these construct the LA mechanism, which reduces the proportion of 
      background semantics in the feature maps and feature channels. Thirdly, the 
      second-type LA transformer block of LA-ViT uses the model 
      probability matrix information and decision-level weight information to realize 
      the key and query, respectively. Together, these realize the LA mechanism, which 
      reduces the proportion of background semantics in class activation maps. Finally, we 
      build a new LSCC pathology image dataset with complex semantics to address the 
      scarcity of research on LSCC grading models, which stems from a lack of clinically 
      meaningful datasets. In extensive experiments, LA-ViT outperforms other 
      state-of-the-art methods on all metrics, and its visualization maps match the 
      regions of interest in the pathologists' decision-making more closely. Moreover, 
      experiments on a public LSCC pathology image dataset show that LA-ViT generalizes 
      better than other state-of-the-art methods.
FAU - Huang, Pan
AU  - Huang P
FAU - Xiao, Hualiang
AU  - Xiao H
FAU - He, Peng
AU  - He P
FAU - Li, Chentao
AU  - Li C
FAU - Guo, Xiaodong
AU  - Guo X
FAU - Tian, Sukun
AU  - Tian S
FAU - Feng, Peng
AU  - Feng P
FAU - Chen, Hu
AU  - Chen H
FAU - Sun, Yuchun
AU  - Sun Y
FAU - Mercaldo, Francesco
AU  - Mercaldo F
FAU - Santone, Antonella
AU  - Santone A
FAU - Qin, Jing
AU  - Qin J
LA  - eng
PT  - Journal Article
DEP - 20240308
PL  - United States
TA  - IEEE J Biomed Health Inform
JT  - IEEE journal of biomedical and health informatics
JID - 101604520
SB  - IM
EDAT- 2024/03/05 18:42
MHDA- 2024/03/05 18:42
CRDT- 2024/03/05 12:54
PHST- 2024/03/05 18:42 [pubmed]
PHST- 2024/03/05 18:42 [medline]
PHST- 2024/03/05 12:54 [entrez]
AID - 10.1109/JBHI.2024.3373438 [doi]
PST - aheadofprint
SO  - IEEE J Biomed Health Inform. 2024 Mar 8;PP. doi: 10.1109/JBHI.2024.3373438.