Abstract
The goal of program synthesis is to automatically generate programs in a
particular language from corresponding specifications, e.g. input-output
behavior. Many current approaches achieve impressive results after training on
randomly generated I/O examples in limited domain-specific languages (DSLs), as
with string transformations in RobustFill. However, we empirically discover
that applying test input generation techniques for languages with control flow
and a rich input space causes deep networks to generalize poorly to certain data
distributions. To correct this, we propose a new methodology for controlling
and evaluating the bias of synthetic data distributions over both programs and
specifications. We demonstrate, using the Karel DSL and a small Calculator DSL,
that training deep networks on these distributions leads to improved
cross-distribution generalization performance.