First, the methodological novelty is limited. The proposed framework mainly combines standard and widely used components, including CNN-based local feature extraction, Transformer-based sequence modeling, and a pretrained BART decoder.
Second, the comparison with prior work is insufficient. The paper reports no direct experimental comparisons with other relevant models. If prior EEG-to-text methods or related baselines can be adapted to the Chinese EEG dataset, they should be included as baselines. If they cannot be fairly adapted, the paper should say so explicitly and explain in detail how the proposed setting differs. Clearer positioning against prior work would better highlight the submission's actual contribution.