Publications

(2024). Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks.

(2024). AST-T5: Structure-Aware Pretraining for Code Generation and Understanding.

(2023). Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers. In ACL 2023.

(2023). ADELT: Transpilation Between Deep Learning Frameworks.

(2022). Joint Language Semantic and Structure Embedding for Knowledge Graph Completion. In COLING 2022.

(2021). PlotCoder: Hierarchical Decoding for Synthesizing Visualization Code in Programmatic Context. In ACL 2021.

(2021). Anytime Sampling for Autoregressive Models via Ordered Autoencoding. In ICLR 2021.

(2020). Improved Clinical Abbreviation Expansion via Non-Sense-Based Approaches. In ML4H (NeurIPS Workshop) 2020.

(2020). MC-BERT: Efficient Language Pre-Training via a Meta Controller.

(2019). Microsoft Research Asia's Systems for WMT19. In WMT19 (ACL 2019 Workshop).

(2019). Efficient Training of BERT by Progressively Stacking. In ICML 2019.