[AI] RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts. Nov 25, 2024. Authors: Hjalmar Wijk, Tao Lin, Joel Becker, Sami Jawhar, Neev Parikh, Thomas Broadley, Lawrence Chan,
[AI] VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement. Nov 25, 2024. Authors: Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal. Abstract: Recent text-to-video (T2V) diffusion models