I am trying to use Erlang's httpc module for highly concurrent requests.
My code, which handles many requests in spawned processes, does not work:
-module(t).
-compile(export_all).

start() ->
    ssl:start(),
    inets:start( httpc, [{profile, default}] ),
    httpc:set_options([{max_sessions, 200}, {pipeline_timeout, 20000}], default),
    {ok, Device} = file:open("c:\urls.txt", read),
    read_each_line(Device).

read_each_line(Device) ->
    case io:get_line(Device, "") of
        eof -> file:close(Device);
        Line -> go( string:substr(Line, 1,length(Line)-1)),
                read_each_line(Device)
    end.

go(Url)->
    spawn(t,geturl, [Url] ).

geturl(Url)->
    UrlHTTP=lists:concat(["http://www.", Url]),
    io:format(UrlHTTP),io:format("~n"),
    {ok, RequestId}=httpc:request(get,{UrlHTTP,[{"User-Agent", "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.8.131 Version/11.10"}]}, [],[{sync, false}]),
    receive
        {http, {RequestId, {_HttpOk, _ResponseHeaders, Body}}} -> io:format("ok"),ok
    end.
httpc:request does not receive the HTML body when I use spawn in:

go(Url)->
    spawn(t,geturl, [Url] ).
http://erlang.org/doc/man/httpc.html
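Note
If possible, the client keeps its connections alive and uses persistent connections with or without pipeline, depending on configuration and current circumstances. The HTTP/1.1 specification does not provide a guideline for how many requests would be ideal to send on a persistent connection; this depends very much on the application. Note that a very long queue of requests can cause a user-perceived delay, as earlier requests can take a long time to complete. The HTTP/1.1 specification does suggest a limit of 2 persistent connections per server, which is the default value of the max_sessions option.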
urls.txt contains different URLs, for example:
google.com
amazon.com
alibaba.com
...
What is wrong?
1 Answer
Your code never actually starts the httpc service (or inets, the application that it depends on), and the confusion probably comes from the unfortunate overloading of the inets:start/[0,1,2,3] function: inets:start/[0,1] starts the inets application itself and the httpc service with the default profile (called default). inets:start/[2,3] (which should be called start_service) starts one of the services that can run atop inets (viz. ftpc, tftp, httpc, httpd) once the inets application has already started.

    start() ->
    start(Type) -> ok | {error, Reason}
        Starts the Inets application.

    start(Service, ServiceConfig) -> {ok, Pid} | {error, Reason}
    start(Service, ServiceConfig, How) -> {ok, Pid} | {error, Reason}
        Dynamically starts an Inets service after the Inets application has been started (with inets:start/[0,1]).

So your spawned process simply crashed when trying to call httpc:request/4, as the service itself was not running. To illustrate, inets:start(httpc, [{profile, default}]) from your start/0 function would fail to start inets and the httpc service.

You should check the returned value of application start to track potential problems:
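A minimal sketch of that kind of check, assuming inets is not yet running when start/0 is called. The module name check_start is hypothetical; the option values are the ones from the question. Matching each result against ok turns a failed start into an immediate badmatch in start/0 rather than a silent crash later in a spawned process.

```erlang
-module(check_start).
-export([start/0]).

%% Start inets (which also brings up the default httpc profile) and
%% assert on every return value, so a failure crashes right here with
%% a badmatch instead of surfacing later inside httpc:request/4.
start() ->
    ok = inets:start(),
    ok = httpc:set_options([{max_sessions, 200},
                            {pipeline_timeout, 20000}], default),
    ok.
```

Calling check_start:start() a second time in the same node fails the first match, because inets:start/0 then returns {error, {already_started, inets}}.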
Or, if the application could already be started, use a function like this:
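One possible shape for such a helper, as a sketch only: the name ensure_start/1 is the one mentioned in this answer, but the body below and the module name ensure are assumptions. It treats "already started" as success and passes any other error through.

```erlang
-module(ensure).
-export([ensure_start/1]).

%% Start App, treating "already started" as success so that the
%% caller can safely invoke this more than once.
ensure_start(App) ->
    case application:start(App) of
        ok ->
            ok;
        {error, {already_started, App}} ->
            ok;
        {error, Reason} ->
            {error, Reason}
    end.
```

For example, ensure:ensure_start(inets) can be called at the top of start/0 every time the program is launched from the same shell.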
[edit 2 - small code enhancement]
I have tested this code and it works on my PC. Note that you are using a '\' in the string for file access; this starts an escape sequence that makes that line fail.
and in the console:
Note that thanks to ensure_start/1 you can launch the application twice. I have also tested with a bad URL, and it is detected.
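Putting the points above together, a corrected module might look roughly like the sketch below. This is not the listing that was tested; the module name t_fixed, the start/1 variant taking a file name, the forward-slash path, and the body of ensure_start/1 are all assumptions made for illustration.

```erlang
-module(t_fixed).
-export([start/0, start/1, geturl/1]).

%% Forward slashes avoid the backslash escape problem in the path.
start() ->
    start("c:/urls.txt").

start(File) ->
    ok = ensure_start(inets),
    ok = httpc:set_options([{max_sessions, 200},
                            {pipeline_timeout, 20000}], default),
    {ok, Device} = file:open(File, [read]),
    read_each_line(Device).

%% Start App, treating "already started" as success.
ensure_start(App) ->
    case application:start(App) of
        ok -> ok;
        {error, {already_started, App}} -> ok;
        {error, Reason} -> {error, Reason}
    end.

%% Spawn one process per non-empty line of the file.
read_each_line(Device) ->
    case io:get_line(Device, "") of
        eof ->
            file:close(Device);
        {error, Reason} ->
            _ = file:close(Device),
            {error, Reason};
        Line ->
            case string:trim(Line) of
                "" -> ok;
                Url -> spawn(?MODULE, geturl, [Url])
            end,
            read_each_line(Device)
    end.

%% Issue an asynchronous GET and report success or failure.
geturl(Url) ->
    UrlHTTP = "http://www." ++ Url,
    Headers = [{"User-Agent",
                "Opera/9.80 (Windows NT 6.1; U; ru) Presto/2.8.131 Version/11.10"}],
    case httpc:request(get, {UrlHTTP, Headers}, [], [{sync, false}]) of
        {ok, RequestId} ->
            receive
                {http, {RequestId, {error, Reason}}} ->
                    io:format("~s: error ~p~n", [UrlHTTP, Reason]);
                {http, {RequestId, {_StatusLine, _RespHeaders, _Body}}} ->
                    io:format("~s: ok~n", [UrlHTTP])
            end;
        {error, Reason} ->
            io:format("~s: request failed ~p~n", [UrlHTTP, Reason])
    end.
```

The extra {error, Reason} clause in the receive is what lets a bad URL be reported instead of leaving the spawned process waiting forever.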
My test includes only 3 URLs. I guess that with many URLs the time to get a response will increase, because the loop that spawns processes runs faster than the requests themselves complete, so you should expect timeout issues at some point. There may also be some limitation in the HTTP client; I didn't check the documentation on this particular point.
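To keep slow or stalled requests from leaving processes blocked forever, the receive can be given an after clause. The sketch below assumes a hypothetical helper await_response/2 in a hypothetical module await; the timeout value is whatever the caller chooses and is not taken from the original code.

```erlang
-module(await).
-export([await_response/2]).

%% Wait for the asynchronous reply to RequestId, giving up after
%% TimeoutMs milliseconds.
await_response(RequestId, TimeoutMs) ->
    receive
        {http, {RequestId, {error, Reason}}} ->
            {error, Reason};
        {http, {RequestId, Result}} ->
            {ok, Result}
    after TimeoutMs ->
        %% Cancel the request so that a late reply is not delivered
        %% to this process after it has stopped waiting.
        httpc:cancel_request(RequestId),
        {error, timeout}
    end.
```

A spawned worker could call await:await_response(RequestId, 30000) right after a successful httpc:request/4 with {sync, false} and log {error, timeout} like any other failure.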