Exposure to audiovisual input has been recognised as an engaging approach to implicit vocabulary learning. Recent studies have compared the effectiveness of different types of subtitling in enhancing viewers’ learning gains in single-word vocabulary, but research on multiword vocabulary, especially collocations, remains limited. The present study examines which type of subtitling (i.e., English–Chinese subtitles or English-only subtitles) is more effective for L2 implicit collocation learning, and attempts to identify the difficulties Chinese L2 learners face when acquiring collocations. Experiments and interviews were conducted with 66 Chinese undergraduates to investigate their acquisition in terms of both form recognition and meaning recall. Results indicate that English–Chinese subtitles are more effective than English-only subtitles in enhancing the acquisition of L2 collocation meaning, although English-only subtitles can be more helpful in reinforcing collocation forms. The interview results also suggest that participants are more likely to encounter difficulties with contextually marked phrasal verbs and idioms, and that affective filters should be considered when analysing individual differences. The findings support the use of L1 in subtitles to improve the efficiency of learning vocabulary meaning, and they can help teachers identify the sources of students’ learning difficulties.