The web search API docs here say that we need to download the result URLs ourselves.
Note: The Search APIs only return URLs for images and webpages, not the full contents. To get the full webpage contents and images, you will have to download them yourself. During the challenge, participants can assume that the connection to these URLs is available.
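Locally this is straightforward, something along the lines of the sketch below (the `url` field name is my assumption, not taken from the docs):

```python
# Minimal local sketch: download one search-result URL ourselves, as the docs describe.
# The "url" key is an assumed field name, not the actual API schema.
import requests

def download_result(result: dict) -> str:
    """Fetch the raw contents of a single search-result URL."""
    resp = requests.get(result["url"], timeout=30)
    resp.raise_for_status()
    return resp.text
```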
During submission, do we need to do this? And how would we do it, given that internet access is turned off? Can we use wget on these websites?
Doesn’t this pose a risk? If a participant owns one of these websites, they could transfer the test questions to their URL (via an HTTP GET or POST) during submission, receive all the hidden test questions, and then hardcode the answers into future submissions.
One suggestion is that participants’ code should not communicate directly with the internet (using wget, etc.). Instead, it would call an API provided by AIcrowd which fetches webpages. This ensures that participants’ code can only receive information from the internet and cannot send information (i.e. the hidden test questions) to external websites.
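For concreteness, here is a rough sketch of what that proxy-based flow could look like from participant code; the endpoint, payload shape, and response field are purely hypothetical:

```python
# Hypothetical sketch of the suggested fetch-proxy flow: participant code asks an
# AIcrowd-hosted endpoint for a page instead of opening a direct outbound connection.
# The endpoint URL, payload shape, and "content" field are invented for illustration.
import requests

PROXY_ENDPOINT = "http://aicrowd-fetch-proxy.local/fetch"  # hypothetical

def fetch_via_proxy(url: str) -> str:
    """Ask the challenge-provided proxy to download `url` on our behalf."""
    resp = requests.post(PROXY_ENDPOINT, json={"url": url}, timeout=30)
    resp.raise_for_status()
    return resp.json()["content"]  # hypothetical response field
```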
If this is how it currently works, how do we call the API to fetch webpages?
@Chris_Deotte: we appreciate the suggestion, and acknowledge the current gap in getting the full page contents for search queries.
We already have a safe approach deployed: on the eval servers, the search API returns the full page contents directly in its response (so the time limit is also agnostic of web-request-related overheads).
It’s just not documented yet, and we hope to roll out documentation on how to use it effectively over the weekend. We are aiming for an interface that works transparently as a drop-in replacement for your local dev flow as well.
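Until the documentation lands, here is a minimal sketch of how participant code might consume such a response; the field names are assumptions, since the actual interface isn’t documented yet:

```python
# Illustrative sketch only: the search API's real interface is not yet documented,
# so the "page_content" and "url" field names below are assumptions.
import requests

def get_page_text(result: dict) -> str:
    """Return the full page contents for one search result.

    On the eval servers the contents are expected to arrive inline with the
    search response; during local development we fall back to fetching the
    URL ourselves, so the same code path works in both environments.
    """
    if result.get("page_content"):                  # inline on the eval servers
        return result["page_content"]
    resp = requests.get(result["url"], timeout=30)  # local dev fallback
    resp.raise_for_status()
    return resp.text
```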