Cannot get response headers #512
Replies: 4 comments
@ejkitchen Could you please share the URL you're trying? It works on my side, as seen in this image, and shouldn't require any flag by default; the problem may be site-specific. Sharing the URL will help me check it further.
Thanks for the response! I decided to go back to basics, as I could not get this to work. We had about 400k URLs, and I sampled about 15 at random (it's a public website): nothing, and no errors in the logs. Everything else was coming through, but not the headers. Using the basic Python libraries, the headers come through without issue. I no longer have the code, since we moved away from the library, but if you share your snippet here, I can try again; if I hit an issue, I'll let you know ASAP, along with the links.
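For reference, here is roughly what that "basic Python libs" sanity check looks like. This is my own minimal sketch, not code from the thread: it spins up a throwaway local HTTP server so it needs no network access, and the `X-Example` header is made up for illustration; against a real site you would pass its URL instead.

```python
# Sanity check: response headers come through with the standard library
# alone, independent of any crawler framework.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("X-Example", "present")  # illustrative header
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        # keep the demo quiet
        pass

# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/"
with urllib.request.urlopen(url) as resp:
    headers = dict(resp.headers)

server.shutdown()
print(headers.get("X-Example"))
```

If the headers show up here but not through the crawler, the loss is happening inside the crawler's result path rather than on the wire.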
Sure, for example:

```python
import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(headless=True) as crawler:
        result = await crawler.arun(
            url="https://en.wikipedia.org/wiki/apple",
            bypass_cache=True,
        )
        print(result.response_headers)

if __name__ == "__main__":
    asyncio.run(main())
```

This generates the log below for me:
@ejkitchen I've figured out the issue. You were right - it occurs when

When I do this:

```python
result = await crawler.arun(url=url, bypass_cache=False)
```

`result.response_headers` is always `{}`.
It appears here (line numbers from the library source):

```python
62   async def arun(
....
131:     crawl_result.response_headers = async_response.response_headers if async_response else {}
```
It appears that async_response is always None. Is there a way to get the headers? Am I missing a flag? I have never been able to get the headers and this is baffling to me.
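For what it's worth, a cache hit would explain exactly this symptom. Below is a hypothetical toy sketch (my own code, not crawl4ai's implementation, and `ToyCrawler` is an invented name): if the cache stores only the page body, a cached result never touches the network, so there is no live response object to copy headers from, and `response_headers` falls back to `{}`.

```python
# Toy model of a crawler whose cache stores only page content, not headers.
class ToyCrawler:
    def __init__(self):
        self._cache = {}  # url -> page body only; headers are never cached

    def _fetch(self, url):
        # Stand-in for a real network fetch that yields body + headers.
        return "<html>...</html>", {"Content-Type": "text/html"}

    def run(self, url, bypass_cache=False):
        if not bypass_cache and url in self._cache:
            # Cache hit: no live response, so there are no headers to report.
            return {"html": self._cache[url], "response_headers": {}}
        html, headers = self._fetch(url)
        self._cache[url] = html
        return {"html": html, "response_headers": headers}

c = ToyCrawler()
first = c.run("https://example.com")                      # real fetch: headers present
second = c.run("https://example.com")                     # cache hit: headers empty
third = c.run("https://example.com", bypass_cache=True)   # forced fetch: headers back
print(second["response_headers"])
```

Under this model, only the very first (or explicitly bypassed) fetch carries headers, which would match headers appearing with `bypass_cache=True` but not with `bypass_cache=False`.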