- Subject: Access received headers from socket.http in storing sink
- From: "J.Jørgen von Bargen" <jjvb.primus@...>
- Date: Sun, 10 Jan 2010 08:25:10 +0100
Hi,
I wanted to show progress while downloading big files with
socket.http, ran into some problems, and made some changes.
The main issue I had is:
* How to access the received headers from within the storing sink?
The main reason is that I need the "content-length" from the reply to
calculate progress.
Another reason could be the file name sent in the reply headers
(usually in "content-disposition"), which you need when the file name
cannot be derived from the URL.
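For reference, servers normally send a suggested file name as the
filename parameter of the Content-Disposition header rather than as a
header of its own. A minimal sketch of pulling it out of a headers
table, assuming the lower-cased header names that socket.http hands
back. filename_from_headers is a hypothetical helper, not part of
LuaSocket; it ignores the filename*= form and differently-cased
parameter names:

```lua
-- Hypothetical helper (not part of LuaSocket): extract a suggested file
-- name from a table of reply headers keyed by lower-case header name.
local function filename_from_headers(headers)
    local cd = headers and headers["content-disposition"]
    if not cd then return nil end
    -- quoted form:   attachment; filename="report.pdf"
    local name = cd:match('filename="([^"]*)"')
    -- unquoted form: attachment; filename=report.pdf
    name = name or cd:match('filename=([^;%s]+)')
    return name
end
```

In a sink this could be called once on the first chunk, with the same
headers table that is used to read "content-length".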
My solution (*please comment on whether this is OK or whether there is
a better way*):
I added two lines (marked with +) to function trequest(reqt) in
socket.http:
    headers = h:receiveheaders()
+   reqt.reply={}
+   reqt.reply.headers=headers
    -- at this point we should have a honest reply from the server
Now I can write a receiving function with a progress display.
Since my request table is passed through all the http layers and stays
the same table, modifications made to the request (as above) are
visible in the sink (as shown below).
--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip
------------------------------------------------------------------------
--- get one url save into a file
-- @param url to get
-- @param file to save
------------------------------------------------------------------------
function get_url_save_long_file(url,file)
    printf("Retrieving %s\n",url)
    local request=url
    if type(url)=="string" then
        request={url=url}
    end
    local fd,err=io.open(file,"wb")
    if not fd then
        Error("open(%s)failed(%s)\n",file,err)
        return nil
    end
    local want
    local have=0
    local p1=io.stdout:seek()
    local t0=socket.gettime()
    ---------------------------------
    -- the receiving sink
    ---------------------------------
    local function sink_fd(chunk, src_err)
        if chunk == nil then
            -- no more data to process, we won't receive more chunks
            fd:close()
            if src_err then
                -- source reports an error, TBD what to do with the data
                -- received up to now
                printf("\n ==> Src_Error=%s\n",src_err)
                return nil,src_err
            else
                printf("\n ==> EOF %s\n",dots(have))
                return true -- or anything that evaluates to true
            end
        elseif chunk == "" then
            printf("\n ==> ''\n")
            -- this is assumed to be without effect on the sink, but may
            -- not be if something different than raw text is processed
            -- do nothing and return true to keep filters happy
            return true -- or anything that evaluates to true
        else
            -- try to get expected length
            if have==0 then
                -- this is where I access the header
                local h=request.reply and request.reply.headers
                -- h is nil without the socket.http patch; the header
                -- value is a string, so convert it to a number
                want=h and tonumber(h["content-length"])
            end
            local size=#chunk
            local elapsed=socket.gettime()-t0
            have=have+size
            if p1 then
                io.stdout:seek("set",p1)
            end
            if want then
                local kbs=0.001*have/elapsed
                local total=elapsed*want/have
                local remain=total-elapsed
                local time_for_this=elapsed*size/have
                printf(" ==>%d %8s/%8s %6.2fkbs (%s/%s rem %s)%s \r",
                    size,dots(have),dots(want),kbs,t2s(elapsed),
                    t2s(total),t2s(remain),t2s(time_for_this))
            else
                printf(" ==> %8s (%s) \r",dots(have),t2s(elapsed))
            end
            -- chunk has data, process/store it as appropriate
            fd:write(chunk)
            return true -- or anything that evaluates to true
        end
        -- in case of error (not reached, every branch above returns)
        return nil, err
    end
    request.sink=sink_fd
    local ret,sts=http.request(request)
    printf("Retrieved \"%s\" = ret=%s,sts=%s\n",url,vis(ret),vis(sts))
    return sts
end
-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end
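The listing calls several helpers that it does not define (printf,
Error, dots, t2s, vis). A minimal sketch of what such helpers could look
like, so the listing can be tried out; the bodies below are assumptions
made for illustration, not the originals, and the output formats are
guesses:

```lua
-- Assumed stand-ins for helpers the listing calls but does not define.

-- printf: formatted write to stdout, flushed so "\r" progress lines show
local function printf(fmt, ...)
    io.stdout:write(string.format(fmt, ...))
    io.stdout:flush()
end

-- Error: formatted write to stderr
local function Error(fmt, ...)
    io.stderr:write(string.format(fmt, ...))
end

-- dots: integer with "." as thousands separator
local function dots(n)
    local s = string.format("%d", math.floor(n))
    local prev
    repeat
        prev = s
        -- insert one separator per pass, working from the right
        s = s:gsub("^(%-?%d+)(%d%d%d)", "%1.%2")
    until s == prev
    return s
end

-- t2s: seconds rendered as h:mm:ss
local function t2s(sec)
    sec = math.floor(sec + 0.5)
    local h = math.floor(sec / 3600)
    local m = math.floor(sec / 60) % 60
    local s = sec % 60
    return string.format("%d:%02d:%02d", h, m, s)
end

-- vis: printable form of any value, strings quoted
local function vis(v)
    if type(v) == "string" then
        return string.format("%q", v)
    end
    return tostring(v)
end
```

These would have to be defined ahead of get_url_save_long_file, which
also needs socket = require("socket") and http = require("socket.http")
in scope.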
This worked fine for me until I hit a redirecting website. My sink
only got the headers from the redirect reply, not those from the reply
carrying the real data.
So I made another change (comments are again welcome).
This is the original:
function tredirect(reqt, location)
    local result, code, headers, status = trequest {
        -- the RFC says the redirect URL has to be absolute, but some
        -- servers do not respect that
        url = url.absolute(reqt.url, location),
        source = reqt.source,
        sink = reqt.sink,
        headers = reqt.headers,
        proxy = reqt.proxy,
        nredirects = (reqt.nredirects or 0) + 1,
        create = reqt.create
    }
    -- pass location header back as a hint we redirected
    headers = headers or {}
    headers.location = headers.location or location
    return result, code, headers, status
end
This is my version:
function tredirect(reqt, location)
    reqt.url=url.absolute(reqt.url, location)
    reqt.nredirects=(reqt.nredirects or 0) + 1
    local result, code, headers, status = trequest(reqt)
    -- pass location header back as a hint we redirected
    headers = headers or {}
    headers.location = headers.location or location
    return result, code, headers, status
end
You see the difference? Instead of creating a new request, I update the
current request, so get_url_save_long_file can get the information it
needs.
Is this OK? Or are there any serious reasons to copy the request as in
the original version?
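The effect of copying versus reusing the request table can be
reproduced without any network access. The sketch below is only a
simulation: fake_trequest and run are made-up stand-ins (not LuaSocket
functions), and the URLs and header values are invented for the
example.

```lua
-- Stand-in for the patched trequest: records the reply headers on the
-- table it was given, then feeds one chunk to the sink.
local function fake_trequest(reqt, reply_headers)
    reqt.reply = { headers = reply_headers }
    reqt.sink("data")
    return 1, 200, reply_headers
end

-- Runs a "redirected" request and reports what the sink could see.
-- copy == true  mimics the original tredirect (fresh request table),
-- copy == false mimics the modified one (same table reused).
local function run(copy)
    local seen
    local request = { url = "http://example.invalid/old" }
    request.sink = function(chunk)
        local h = request.reply and request.reply.headers
        seen = h and h["content-length"]
        return true
    end
    -- first reply: the redirect itself, carrying no content-length
    request.reply = { headers = { location = "http://example.invalid/new" } }
    local target = request
    if copy then
        target = { sink = request.sink }
    end
    target.url = "http://example.invalid/new"
    fake_trequest(target, { ["content-length"] = "1000" })
    return seen, request.url
end

local seen_copy, url_copy = run(true)
local seen_same, url_same = run(false)
```

With the copy, the sink sees no content-length and the caller's url is
untouched; with the table reused, the sink sees "1000", but the
caller's request.url now points at the redirect target. In the real
tredirect the reused table also carries over every field the original
does not copy, which may or may not be wanted.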
Looking for advice,
Regards JJvB