@misc{zhang2024autocoderover,
title={AutoCodeRover: Autonomous Program Improvement},
author={Yuntong Zhang and Haifeng Ruan and Zhiyu Fan and Abhik Roychoudhury},
year={2024},
eprint={2404.05427},
archivePrefix={arXiv},
primaryClass={cs.SE}
}
A SequenceInputStream represents the logical concatenation of other input streams. It starts out with an ordered collection of input streams and reads from the first one until end of file is reached, whereupon it reads from the second one, and so on, until end of file is reached on the last of the contained input streams.
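The concatenation behaviour described above can be sketched with in-memory streams. This is a minimal illustration, not taken from the quoted documentation: the class name `SequenceInputStreamSketch` and the helper `readConcatenated` are hypothetical, and `readAllBytes` assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SequenceInputStreamSketch {

    // Hypothetical helper: wraps each part in its own stream, then reads
    // them back through one SequenceInputStream in the given order.
    static String readConcatenated(List<String> parts) {
        List<InputStream> streams = new ArrayList<>();
        for (String part : parts) {
            streams.add(new ByteArrayInputStream(part.getBytes(StandardCharsets.UTF_8)));
        }
        // The Enumeration constructor accepts an ordered collection of streams.
        try (InputStream in = new SequenceInputStream(Collections.enumeration(streams))) {
            // Reads the first stream to end of file, then the second, and so on.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readConcatenated(List.of("first ", "second ", "third")));
    }
}
```

Closing the SequenceInputStream also closes the contained streams, so a single try-with-resources block is enough here.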
pgloader will keep a separate file of rejected data, but continue trying to copy good data in your database.
pgloader also implements data reformatting, a typical example of that being the transformation of the MySQL datestamps 0000-00-00 and 0000-00-00 00:00:00 to the PostgreSQL NULL value.
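To make the zero-date rule concrete, here is a small Java sketch of the same mapping. It is only an illustration of the transformation, not pgloader's implementation: the class `ZeroDateSketch` and the method `toTimestampOrNull` are hypothetical names, and the accepted input formats are an assumption.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ZeroDateSketch {

    // Assumed MySQL DATETIME text format: "2024-01-15 10:30:00".
    private static final DateTimeFormatter DATE_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Hypothetical helper: returns null for MySQL zero dates, since they
    // have no valid PostgreSQL equivalent; otherwise parses the value.
    static LocalDateTime toTimestampOrNull(String mysqlValue) {
        if (mysqlValue == null
                || mysqlValue.equals("0000-00-00")
                || mysqlValue.equals("0000-00-00 00:00:00")) {
            return null;
        }
        if (mysqlValue.length() == 10) {
            // Date-only value such as "2024-01-15": use midnight as the time part.
            return LocalDate.parse(mysqlValue).atStartOfDay();
        }
        return LocalDateTime.parse(mysqlValue, DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(toTimestampOrNull("0000-00-00 00:00:00"));
        System.out.println(toTimestampOrNull("2024-01-15 10:30:00"));
    }
}
```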
A very common workflow is to index some data based on its embeddings and then given a new query embedding retrieve the most similar examples with k-Nearest Neighbor search. For example, you can imagine embedding a large collection of papers by their abstracts and then given a new paper of interest retrieve the most similar papers to it.
TLDR in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit
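For reference, the kNN baseline mentioned above can be sketched as a brute-force cosine-similarity ranking. This covers only the kNN side; the SVM variant is not shown. The class `KnnSketch` and its methods are hypothetical names, and the sketch assumes non-zero embedding vectors of equal length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class KnnSketch {

    // Cosine similarity between two vectors (assumes neither is all zeros).
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the row indices of the k indexed embeddings most similar
    // to the query, most similar first.
    static List<Integer> topK(double[][] index, double[] query, int k) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < index.length; i++) {
            ids.add(i);
        }
        // Sort by descending similarity (negated so the largest comes first).
        ids.sort(Comparator.comparingDouble((Integer i) -> -cosine(index[i], query)));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }

    public static void main(String[] args) {
        double[][] index = { {1.0, 0.0}, {0.0, 1.0}, {0.9, 0.1} };
        System.out.println(topK(index, new double[] {1.0, 0.0}, 2));
    }
}
```

A full sort is O(n log n) per query, which is fine for a sketch; a real index would typically use a heap or an approximate nearest-neighbor structure instead.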
["slug" being an entity attribute]
Spring Data offers an existsBy query method, which we can define in the PostRepository, as follows:
@Repository
public interface PostRepository
        extends JpaRepository<Post, Long> {

    boolean existsBySlug(String slug);
}
[another] option to emulate existence is using a CASE WHEN EXISTS native SQL query:
@Repository
public interface PostRepository
        extends JpaRepository<Post, Long> {

    @Query(value = """
        SELECT
            CASE WHEN EXISTS (
                SELECT 1
                FROM post
                WHERE slug = :slug
            )
            THEN 'true'
            ELSE 'false'
            END
        """,
        nativeQuery = true
    )
    boolean existsBySlugWithCase(@Param("slug") String slug);
}
@Repository
public interface PostRepository extends BaseJpaRepository<Post, Long> {

    @Query("""
        select p
        from Post p
        where date(p.createdOn) >= :sinceDate
        """
    )
    @QueryHints(
        @QueryHint(name = AvailableHints.HINT_FETCH_SIZE, value = "25")
    )
    Stream<Post> streamByCreatedOnSince(@Param("sinceDate") LocalDate sinceDate);
}
The FETCH_SIZE JPA query hint is necessary for PostgreSQL and MySQL to instruct the JDBC Driver to prefetch at most 25 records. Otherwise, the PostgreSQL and MySQL JDBC Drivers would prefetch all the query results prior to traversing the underlying ResultSet.