Linyuan Gong
Publications
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
Linyuan Gong
,
Sida Wang
,
Mostafa Elhoushi
,
Alvin Cheung
PDF
Cite
Code
Dataset
Project
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
We introduce AST-T5, a novel pretraining paradigm that leverages the Abstract Syntax Tree (AST) for enhanced code generation, transpilation, and understanding.
Linyuan Gong
,
Mostafa Elhoushi
,
Alvin Cheung
PDF
Cite
Code
Model-Generated Pretraining Signals Improve Zero-Shot Generalization of Text-to-Text Transformers
This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.
Linyuan Gong
,
Chenyan Xiong
,
Xiaodong Liu
,
Payal Bajaj
,
Yiqing Xie
,
Alvin Cheung
,
Jianfeng Gao
,
Xia Song
PDF
Cite
Code
Video
DOI
ADELT: Transpilation Between Deep Learning Frameworks
We propose a transpiler between deep learning frameworks that combines large language models (LLMs) with adversarial learning techniques.
Linyuan Gong
,
Jiayi Wang
,
Alvin Cheung
PDF
Cite
Joint Language Semantic and Structure Embedding for Knowledge Graph Completion
We train language models that jointly embed language semantics and graph structure for knowledge graph completion tasks.
Jianhao Shen
,
Chenguang Wang
,
Linyuan Gong
,
Dawn Song
PDF
Cite
Code
PlotCoder: Hierarchical Decoding for Synthesizing Visualization Code in Programmatic Context
In this paper, we propose the new task of synthesizing visualization programs from a combination of natural language utterances and code context.
Xinyun Chen
,
Linyuan Gong
,
Alvin Cheung
,
Dawn Song
PDF
Cite
Code
Video
DOI
Anytime Sampling for Autoregressive Models via Ordered Autoencoding
We propose a new family of autoregressive models that enables anytime sampling.
Yilun Xu
,
Yang Song
,
Sahaj Garg
,
Linyuan Gong
,
Rui Shu
,
Aditya Grover
,
Stefano Ermon
PDF
Cite
Code
Slides
Improved Clinical Abbreviation Expansion via Non-Sense-Based Approaches
We propose two language-model-based approaches, including a novel length-agnostic permutation language model, and find non-sense-based methods to be more effective than sense-based methods.
Juyong Kim
,
Linyuan Gong
,
Justin Khim
,
Jeremy C. Weiss
,
Pradeep Ravikumar
PDF
Cite
Code
Microsoft Research Asia's Systems for WMT19
Technical report for the WMT19 News Translation competition, where we won first place in 8 of the 11 translation directions.
Yingce Xia
,
Xu Tan
,
Fei Tian
,
Fei Gao
,
Weicong Chen
,
Yang Fan
,
Linyuan Gong
,
Yichong Leng
,
Renqian Luo
,
Yiren Wang
,
Lijun Wu
,
Jinhua Zhu
,
Tao Qin
,
Tie-Yan Liu
PDF
Cite
DOI
Efficient Training of BERT by Progressively Stacking
We explore an efficient training method for BERT models.
Linyuan Gong
,
Di He
,
Zhuohan Li
,
Tao Qin
,
Liwei Wang
,
Tie-Yan Liu
PDF
Cite
Code