Changeset 3335

Timestamp:
01/07/10 18:21:14
Author:
cemeyer
Message:

httpretrieve: Improve readability of docstrings, comments; approach code style guidelines.
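
The file touched by this changeset exposes the response body as a file-like object, and httpretrieve_save_file() drains it with 1024-byte reads until read() returns an empty string. A minimal sketch of that loop in modern Python rather than Repy, using io.BytesIO as a stand-in for the HTTP object; copy_in_chunks is a hypothetical name and is not part of httpretrieve.repy:

```python
import io


def copy_in_chunks(source, sink, chunk_size=1024):
    """Copy from one file-like object to another.

    Hypothetical helper for illustration only. It reads chunk_size
    bytes at a time and stops when read() returns an empty value,
    which is how a file-like object signals end of data.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read means there is nothing left to copy.
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for the HTTP response object and the destination file.
body = io.BytesIO(b'x' * 2500)
out = io.BytesIO()
copied = copy_in_chunks(body, out)
```

Reading in fixed-size chunks keeps memory use bounded by chunk_size rather than by the size of the response.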

Files:
1 modified

  • seattle/trunk/seattlelib/httpretrieve.repy

    r3332 r3335  
    1010 
    1111<Purpose> 
    12   provides a http content from a web server using http protocol. It sends a http request to 
    13   any http server through socket connection to get the http content. Then once the http server 
    14   replies with http header and content, the http header is checked for any error message. 
    15   Then provides http content in a format of string, saved in a file or as a file like object.  
     12  Provides a method for retrieving content from web servers using the HTTP 
     13  protocol. The content can be accessed as a file like object, or saved to 
     14  a file or returned as a string. 
    1615""" 
    1716 
     
    1918 
    2019 
    21 include urlparse.repy   
     20include urlparse.repy 
    2221include sockettimeout.repy 
    2322include http_hierarchy_error.repy 
     
    2827 
    2928 
    30 def httpretrieve_open(url, http_query=None, http_post=None, http_header=None, header_timeout=30, content_timeout=30, httpheader_limit=8192, httpcontent_limit=4194304): 
     29def httpretrieve_open(url, http_query=None, http_post=None, \ 
     30    http_header=None, header_timeout=30, content_timeout=30, \ 
     31    httpheader_limit=8192, httpcontent_limit=4194304): 
    3132  """ 
    3233  <Purpose> 
    33      Returns file like object that that can read the http content form a http server. The file like 
    34      object gets string from http server using read method.   
     34     Returns a file-like object that can be used to read the content from 
     35     an HTTP server. 
    3536 
    3637  <Arguments> 
    3738    url: 
    38            String of a http web server-URL 
    39     http_post: 
    40            dictionary of data to post to server(unencoded string, the library encodes the post it self) 
    41     http_query: 
    42            dictionary of query to send to server(unencoded string, the library encodes the query it self) 
    43     http_header: 
    44            dictionary of http header to add the the http header request 
    45     header_timeout: 
    46            socket timeout for receiving header from server(default value set to 30 seconds) 
    47     content_timeout: 
    48            socket timeout for receiving content from server(default value set to 30 seconds) 
    49     httpheader_limit: 
    50            length limit for when a server sends http header(default value set to 8kb(8192 charactors))  
    51     httpcontent_limit: 
    52             limits the the amount of content a server can send to the retrieval  
    53   
     39           The URL to perform a GET or POST request on. 
     40    http_post (optional): 
     41           A dictionary of form data to POST to the server. Passing 
     42           a non-None value results in a POST request being sent to the 
     43           server. 
     44    http_query (optional): 
     45           A dictionary of form data to send as a GET request's query 
     46           string to the server. 
     47 
     48           If http_post is omitted, the URL is retrieved with GET. If 
     49           both http_post and http_query are omitted, there is no query 
     50           string sent in the request. 
     51    http_header (optional): 
     52           A dictionary of supplemental HTTP request headers to add to the 
     53           request. 
     54    header_timeout (optional): 
     55           A timeout for receiving the HTTP response headers from the 
     56           server. Defaults to 30 seconds. 
     57    content_timeout (optional): 
     58           A timeout for receiving the body of the HTTP response from the 
     59           server. Defaults to 30 seconds. 
     60    httpheader_limit (optional): 
     61           An optional limit on the quantity of HTTP response headers to 
     62           accept from the server. Defaults to 8 kiB. 
     63    httpcontent_limit (optional): 
     64           An optional limit on the quantity of the HTTP response's body. 
     65           Defaults to 4 MiB. 
     66 
    5467  <Exceptions> 
    55         HttpUserInputError 
    56             ->  If given a invalid URL  
    57             ->  If given a none http protocol server 
    58             ->  if file like object read is given a negative number or a none int as a limit 
    59             ->  if the file like object is called after it is closed 
    60  
    61         HttpConnectionError 
    62             ->  if opening connection with server fails       
    63             ->  if sending http request to http server fails   
    64  
    65         HttpHeaderReceivingError 
    66             ->  If the timeout(default timeout set to 5 seconds) for receiving exceeds    
    67             ->  If the http header length exceeds (default  
    68              
    69         HttpHeaderFormatError 
    70             ->  If The http header is too long(default set to 8 kb(8192 charactors)) 
    71             ->  If the http header statuscode format is invalid. The right http header 
    72                     status code includes HTTP<version> http_status_number http_status_msg                  
    73             ->  If the http server gives a http header with a invalid content length 
    74                     or invalid redirect location 
    75              
    76         HttpContentReceivingError: 
    77             ->  if the http server fails to send http content  
    78             ->  if the timeout(default timeout set to 5 seconds) for receiving exceeds 
    79             ->  if the server socket connection fails for any reason besides connection closing 
    80                     during receiving http content 
    81                          
    82         HttpContentLengthError: 
    83             ->  if content length exceeds(default set to 4096 kb(4194304 charactors))  
    84                     (ONLY WORKS CONTENT LENGTH IS GIVEN BY THE HTTP SERVER) 
    85             ->  If the total received length is not the same as the content length(this check will fail 
    86                      if the content length isnt given) 
    87             ->  If read is called with limits and it returns a empty string but the total read is                                       
    88                      not equal to the content length(this check will fail if the content length isnt given)                 
    89                                              
    90         HttpStatuscodeError: 
    91             -> if the http response status code isnt ok or redirect, it will raise an exception 
    92                    depending up on the http protocol status number   
    93          
     68    HttpUserInputError if given an invalid URL, or malformed limit / 
     69      timeout values. This is also raised if the user attempts to call 
     70      a method on the file-like object after closing it. 
     71 
     72    HttpConnectionError if opening the connection or sending the HTTP 
     73      request fails. 
     74 
     75    HttpHeaderReceivingError if the timeout for receiving headers is 
     76      exceeded or the limit on header data size is exceeded. 
     77 
     78    HttpHeaderFormatError if the header data size is exceeded, the 
     79      response is malformed, the Content-length response header is 
     80      negative, or the Location response header is malformed. 
     81 
     82    HttpContentReceivingError if the server fails to send the content, 
     83      if the timeout for receiving is exceeded, or if the server 
     84      connection fails for any reason while receiving content. 
     85 
     86    HttpContentLengthError if the Content-length header returned by the 
     87      server exceeds httpcontent_limit, or if the Content-length header 
     88      doesn't match the amount of data sent by the server. 
     89 
     90    HttpStatuscodeError if the status code isn't 2xx or 3xx. 
    9491 
    9592  <Side Effects> 
    96     None  
     93    None 
    9794 
    9895  <Returns> 
    99     Returns file like obj which can read the http content from http web server.  
     96    Returns a file-like object which can be used to read the body of 
     97    the response from the web server. 
    10098  """ 
    101    
    102   # check if url is valid and get host, path, port and query from the given url 
     99 
     100  # Check if the URL is valid and get host, path, port and query 
    103101  (host, port, path, url_query) = _httpretrieve_parse_given_url(url) 
    104102 
    105   # get connection to the http web server 
     103  # Open connection to the web server 
    106104  try: 
    107105    sock = timeout_openconn(host, port) 
    108      
     106 
    109107  except Exception, e: 
    110108    raise HttpConnectionError('Error: opening a connection failed with given http server, Given: ' + str(url) + ' ' + str(e)) 
    111      
    112   # build a http format request using the given port, host, path and query 
    113   httpheader = _httpretrieve_buildhttprequest(http_header, port, host, path, url_query, http_query, http_post) 
    114  
    115   # send http format request to http web server   
    116   _httpretrieve_sendhttprequest(sock, httpheader)  
    117  
    118   # receive the http header lines in a form of list from the http web server 
    119   httpheaderlines = _httpretrieve_receive_httpheader(sock, header_timeout, httpheader_limit) 
    120    
    121   # get the http status number and http status msg from http header response 
    122   (http_status_number, http_status_msg) = _httpretrieve_get_httpstatuscode(httpheaderlines) 
    123  
    124   if http_status_number == '200':# ok message 
    125     # gets the content length if given. 
     109 
     110  # build an HTTP request using the given port, host, path and query 
     111  httpheader = _httpretrieve_buildhttprequest(http_header, port, host, \ 
     112      path, url_query, http_query, http_post) 
     113 
     114  # send HTTP request to the web server 
     115  _httpretrieve_sendhttprequest(sock, httpheader) 
     116 
     117  # receive the header lines from the web server 
     118  httpheaderlines = _httpretrieve_receive_httpheader(sock, \ 
     119      header_timeout, httpheader_limit) 
     120 
     121  # get the status code and status message from the HTTP response 
     122  (http_status_number, http_status_msg) = \ 
     123      _httpretrieve_get_httpstatuscode(httpheaderlines) 
     124 
     125  if http_status_number == '200': 
    126126    contentlength = _httpretrieve_get_contentlength(httpheaderlines) 
    127     # return a filelikeobj to read the http content from the http server  
    128     return _httpretrieve_filelikeobject(sock, contentlength, httpcontent_limit, content_timeout)  
    129    
    130   elif http_status_number == '301' or http_status_number == '302': # redirect 
    131     # redirect to the new location via recursion   
     127    return _httpretrieve_filelikeobject(sock, contentlength, \ 
     128        httpcontent_limit, content_timeout) 
     129 
     130  elif http_status_number == '301' or http_status_number == '302': 
     131    # redirect to the new location via recursion 
    132132    sock.close() 
    133     # get the redirection location 
    134     redirect_location = _httpretrieve_httpredirect(httpheaderlines)  
    135     # redirect to the new location using recursion 
    136     return httpretrieve_open(redirect_location)      
     133    redirect_location = _httpretrieve_httpredirect(httpheaderlines) 
     134    return httpretrieve_open(redirect_location) 
    137135 
    138136  else: 
    139     # if given receive content length inorder to check http content error is received fully 
     137    # Raise an exception detailing the status code and content of the 
     138    # page to the user. 
    140139    contentlength = _httpretrieve_get_contentlength(httpheaderlines) 
    141     # receive the http content error  
    142     http_errorcontent = _httpretrieve_receive_httperror_content(sock, contentlength) 
    143     # raise exception depending up on the http status number and add on the http error content after a 
    144     # discription that says 'Http error content: ' 
    145     _httpretrieve_raise_httpstatuscode_error(http_status_number, http_status_msg, http_errorcontent)    
    146  
    147  
    148    
    149  
    150  
    151 def httpretrieve_save_file(url, filename, http_query=None, http_post=None, http_header=None, header_timeout=30, content_timeout=30, httpheader_limit=8192, httpcontent_limit=4194304): 
     140    http_errorcontent = \ 
     141        _httpretrieve_receive_httperror_content(sock, contentlength) 
     142    _httpretrieve_raise_httpstatuscode_error(http_status_number, \ 
     143        http_status_msg, http_errorcontent) 
     144 
     145 
     146 
     147 
     148 
     149def httpretrieve_save_file(url, filename, http_query=None, http_post=None, \ 
     150    http_header=None, header_timeout=30, content_timeout=30, \ 
     151    httpheader_limit=8192, httpcontent_limit=4194304): 
    152152  """ 
    153153  <Purpose> 
    154      Saves http content of the given URL to current directory   
     154    Performs an HTTP request, and saves the content of the response to a 
     155    file. 
    155156 
    156157  <Arguments> 
    157     url: 
    158            String of a http web server URL 
    159158    filename: 
    160            The file name for the http content to be saved in 
    161     http_post: 
    162            dictionary of data to post to server(unencoded string, the library encodes the post it self) 
    163     http_query: 
    164            dictionary of query to send to server(unencoded string, the library encodes the query it self) 
    165     http_header: 
    166            dictionary of http header to add the the http header request 
    167     header_timeout: 
    168            socket timeout for receiving header from server(default value set to 30 seconds) 
    169     content_timeout: 
    170            socket timeout for receiving content from server(default value set to 30 seconds) 
    171     httpheader_limit: 
    172            length limit for when a server sends http header(default value set to 8kb(8192 charactors))  
    173     httpcontent_limit: 
    174             limits the the amount of content a server can send to the retrieval  
    175             
     159           The file name to save the response to. 
     160    Other arguments: 
     161           See documentation for httpretrieve_open(). 
     162 
    176163  <Exceptions> 
    177    
    178     HttpRetrieveClientError: cant create a file to save the http content too 
    179     Also includes:  all the exception from httpretrieve_open  
     164    HttpRetrieveClientError if we cannot create the file. 
     165 
     166    This function will also raise any exception raised by httpretrieve_open(), 
     167    for the same reasons. 
    180168 
    181169  <Side Effects> 
    182     same as httpretrieve_open   
     170    Writes the body of the response to 'filename'. 
    183171 
    184172  <Returns> 
    185     None  
     173    None 
    186174  """ 
    187    
     175 
    188176  httpcontent = '' 
    189177  try: 
    190     # create a new file with the given filename 
    191178    newfile = open(filename, 'w') 
    192179  except Exception, e: 
    193     raise HttpRetrieveClientError('Error on creating a file to saving http content' + str(e)) 
    194  
    195   http_obj = httpretrieve_open(url, http_query, http_post, http_header, header_timeout, content_timeout, httpheader_limit, httpcontent_limit) 
    196  
    197   # keep on reading 1024 and writing it into a file, until it receives an empty string 
    198   # which means the http server content is completely read  
     180    raise HttpRetrieveClientError( \ 
     181        'Error creating a file to save http content: ' + str(e)) 
     182 
     183  http_obj = httpretrieve_open(url, http_query, http_post, http_header, \ 
     184      header_timeout, content_timeout, httpheader_limit, httpcontent_limit) 
     185 
     186  # Read from the file-like HTTP object into our file. 
    199187  while True: 
    200188    httpcontent = http_obj.read(1024) 
    201189    if httpcontent == '': 
    202       # done reading close file and file like obj and exit loop  
    203       newfile.close()   
     190      # we're done reading 
     191      newfile.close() 
    204192      http_obj.close() 
    205193      break 
     
    207195 
    208196 
    209    
    210 def httpretrieve_get_string(url, http_query=None, http_post=None, http_header=None, header_timeout=30, content_timeout=30, httpheader_limit=8192, httpcontent_limit=4194304): 
     197 
     198def httpretrieve_get_string(url, http_query=None, http_post=None, \ 
     199    http_header=None, header_timeout=30, content_timeout=30, \ 
     200    httpheader_limit=8192, httpcontent_limit=4194304): 
    211201  """ 
    212202  <Purpose> 
    213      retruns string of the http content from a given URL 
     203    Performs an HTTP request on the given URL, using POST or GET, 
     204    returning the content of the response as a string. Uses 
     205    httpretrieve_open. 
    214206 
    215207  <Arguments> 
    216     url: 
    217            String of a http web server-URL 
    218     http_post: 
    219            dictionary of data to post to server(unencoded string, the library encodes the post it self) 
    220     http_query: 
    221            dictionary of query to send to server(unencoded string, the library encodes the query it self) 
    222     http_header: 
    223            dictionary of http header to add the the http header request 
    224     header_timeout: 
    225            socket timeout for receiving header from server(default value set to 30 seconds) 
    226     content_timeout: 
    227            socket timeout for receiving content from server(default value set to 30 seconds) 
    228     httpheader_limit: 
    229            length limit for when a server sends http header(default value set to 8kb(8192 charactors))  
    230     httpcontent_limit: 
    231             limits the the amount of content a server can send to the retrieval  
    232   
     208    See httpretrieve_open. 
     209 
    233210  <Exceptions> 
    234      same as httpretrieve_open 
     211    See httpretrieve_open. 
    235212 
    236213  <Side Effects> 
    237      same as httpretrieve_open  
     214    None. 
    238215 
    239216  <Returns> 
    240      returns a string of the http content.  
    241   """        
    242    
    243   http_obj = httpretrieve_open(url, http_query, http_post, http_header, header_timeout, content_timeout, httpheader_limit, httpcontent_limit) 
    244   # retrieve the http content from server using httpretrieve file like object 
     217    Returns the body of the HTTP response (no headers). 
     218  """ 
     219 
     220  http_obj = httpretrieve_open(url, http_query, http_post, http_header, \ 
     221      header_timeout, content_timeout, httpheader_limit, httpcontent_limit) 
    245222  httpcontent = http_obj.read() 
    246223  http_obj.close() 
    247   # return the http content in a form of string 
    248224  return httpcontent 
    249225 
     
    251227 
    252228class _httpretrieve_filelikeobject: 
    253   # file like object used to receive the http content with a length limit    
     229  # This class implements a file-like object used for performing HTTP 
     230  # requests and retrieving responses. 
     231 
    254232  def __init__(self, sock, contentlength, httpcontent_limit, content_timeout): 
    255233    self.sock = sock 
     
    259237      self.contentlengthisknown = True 
    260238      self.contentlength = contentlength 
    261     self.httpcontent_limit = httpcontent_limit   
    262     self.content_timeout = content_timeout   
     239    self.httpcontent_limit = httpcontent_limit 
     240    self.content_timeout = content_timeout 
    263241    self.fileobjclosed = False 
    264     self.totalcontentisreceived = False  
     242    self.totalcontentisreceived = False 
    265243    self.totalread = 0 
    266      
     244 
     245 
    267246 
    268247  def read(self, limit = None): 
    269248    """ 
    270249    <Purpose> 
    271         reads the http content from http server using the file like object    
     250      Behaves like Python's file.read(), with the potential to raise 
     251      additional informative exceptions. 
    272252 
    273253    <Arguments> 
    274         limit(optional): 
    275              maximum number of bytes to read. If not specified the whole file is read. 
    276              Can be 0 or any positive int 
    277     
     254      limit (optional): 
     255            The maximum amount of data to read. If omitted or None, this 
     256            reads all available data. 
     257 
    278258    <Exceptions> 
    279  
    280         HttpContentReceivingError: 
    281             ->  if the http server fails to send http content  
    282             ->  if the timeout(default timeout set to 5 seconds) for receiving exceeds 
    283             ->  if the server socket connection fails for any reason besides connection closing 
    284                     during receiving http content 
    285         HttpContentLengthError: 
    286             ->  if content length exceeds(default set to 4096 kb(4194304 charactors)) 
    287                     (ONLY WORKS CONTENT LENGTH IS GIVEN BY THE HTTP SERVER) 
    288             ->  If the total received length is not the same as the content length(this check will fail 
    289                      if the content length isnt given) 
    290             ->  If read is called with limits and it returns a empty string but the total read is                                       
    291                      not equal to the content length(this check will fail if the content length isnt given) 
    292                       
    293         HttpStatuscodeError: 
    294             if the http response status code isnt ok or redirect, it will raise an exception 
    295             depending up on the http protocol status number   
    296  
     259      See file.read()'s documentation, as well as that of 
     260      httpretrieve_open(). 
    297261 
    298262    <Side Effects> 
    299        None 
    300  
     263      None. 
    301264 
    302265    <Returns> 
    303       returns the content of http server in a form of string or an empty string if the content is completely read. 
     266      See file.read(). 
    304267    """ 
    305      
    306     # raises an exception, if read is called after the filelikeobj is closed  
     268 
    307269    if self.fileobjclosed == True: 
    308270      raise HttpUserInputError('Http Error: filelikeobj is closed') 
    309271 
    310     # if read is called after all http content received return an empty string 
    311272    if self.totalcontentisreceived: 
    312273      return '' 
    313     
    314     # check if limit is given 
    315     if limit == None:  
    316       readhaslimit = False  
     274 
     275    if limit == None: 
     276      readhaslimit = False 
    317277      left_to_read = 1024 
    318     else:  
    319     # check if limit is a valid number 
     278    else: 
    320279      if not type(limit) == int: 
    321         raise HttpUserInputError('User input Error: given a none int to receive' + str(e))   
     280        raise HttpUserInputError( \ 
     281            'User input Error: given a non-int to receive: ' + str(limit)) 
    322282      elif limit < 0: 
    323         # raise an exception if limit is a negative number 
    324         raise HttpUserInputError('User input Error: given a negative number to receive, given: ' + str(limit)) 
     283        raise HttpUserInputError( \ 
     284            'User input Error: given a negative number to receive, given: ' + \ 
     285            str(limit)) 
    325286      readhaslimit = True 
    326       left_to_read = limit   
    327  
    328     # set a timeout for receiveing content from server  
     287      left_to_read = limit 
     288 
    329289    self.sock.settimeout(self.content_timeout) 
    330290 
    331     # if limit is given it will receiveby subtracting what is left until the limit is reached  
    332     # if limit isnt given it will receive1024 until the server closes connection        
     291    # Try to read up to limit, or until there is nothing left. 
    333292    httpcontent = '' 
    334293    while True: 
     
    337296 
    338297      except SocketTimeoutError: 
    339         # raise an exception if receiveis taking too long to respond 
    340         self.sock.close()   
    341         raise HttpContentReceivingError('Timeout Error on receiving content: server taking too long to send content')   
    342  
    343       except Exception , e: 
    344         # socket closed - signal for when the server is done sending content 
    345         # if there is any other exceptions close connection and raise an error   
    346         if 'Socket closed' not in str(e):  
    347           self.sock.close()             
     298        self.sock.close() 
     299        raise HttpContentReceivingError( \ 
     300            'Timeout Error on receiving content: server taking too long to send content') 
     301 
     302      except Exception, e: 
     303        if 'Socket closed' not in str(e): 
     304          self.sock.close() 
    348305          raise HttpContentReceivingError('Error on receiving content:' + str(e)) 
     306 
    349307        self.totalcontentisreceived = True 
    350308        break 
    351309 
    352310      else: 
    353         # By default, httpretrieve permits content length to be less than 4,096 kilobytes(4194304 charactors) 
    354311        if len(content) >= self.httpcontent_limit: 
    355           raise HttpContentLengthError('content length exceeded ' + self.httpcontent_limit) 
    356  
    357         # add what is received 
     312          raise HttpContentLengthError('content length exceeded ' + \ 
     313              str(self.httpcontent_limit)) 
     314 
    358315        httpcontent += content 
    359316        if readhaslimit: 
    360           # keep subtracting what is left to receieve until it reachs the given limit amount 
    361317          self.totalread += len(content) 
    362318          if len(content) == left_to_read: 
     
    365321            left_to_read -= len(content) 
    366322 
    367     # check if there was an error during reciving http content 
     323    # Check if there was an error receiving the HTTP response. 
    368324    self._check_recieving_error(readhaslimit, httpcontent) 
    369325 
    370326    return httpcontent 
    371    
     327 
     328 
    372329 
    373330  def close(self): 
    374331    """ 
    375332    <Purpose> 
    376       close the file like object  
     333      Close the file-like object. 
    377334 
    378335    <Arguments> 
    379336      None 
    380     
     337 
    381338    <Exceptions> 
    382       None  
     339      None 
    383340 
    384341    <Side Effects> 
    385       closes socket connection for the http client to http server 
     342      Disconnects from the HTTP server. 
    386343 
    387344    <Returns> 
    388       Nothing  
    389     """    
    390     self.fileobjclosed = True# flag used to raise an exception if the file like object is called after closed 
     345      Nothing 
     346    """ 
     347    self.fileobjclosed = True 
    391348    self.sock.close() 
     349 
    392350 
    393351 
     
    395353    if len(httpcontent) == 0: 
    396354      self.sock.close() 
    397       raise HttpContentLengthError('Error on recieving content: received a http header but didnt receive any http content') 
    398      
    399     if self.contentlengthisknown:                     
    400       # if limit is given and content length is given and total content is received, check the total read equals the content length     
     355      raise HttpContentLengthError('Error on receiving content: ' + \ 
     356          'received a http header but did not receive any http content') 
     357 
     358    if self.contentlengthisknown: 
    401359      if readhaslimit and self.totalcontentisreceived: 
    402360        if self.totalread != self.contentlength: 
    403           self.sock.close()                 
    404           raise HttpContentLengthError('Total length read with limit did not match the content length: total read: ' + str(self.totalread) + ' content length: ' + str(self.contentlength)) 
    405  
    406       # if called read without limit and content length is given; check if the received length is the same as the content length        
     361          self.sock.close() 
     362          raise HttpContentLengthError('Total length read with limit ' + \ 
     363              'did not match the content length: total read: ' + \ 
     364              str(self.totalread) + ' content length: ' + \ 
     365              str(self.contentlength)) 
     366 
    407367      if readhaslimit == False: 
    408368        if len(httpcontent) != self.contentlength: 
    409369          self.sock.close() 
    410           raise HttpContentLengthError('Total received length did not match the content length: received: ' + str(len(httpcontent)) + ' content length : ' + str(self.contentlength))      
    411  
     370          raise HttpContentLengthError('Total received length did not ' + \ 
     371              'match the content length: received: ' + \ 
     372              str(len(httpcontent)) + ' content length : ' + \ 
     373              str(self.contentlength)) 
    412374 
    413375 
     
    415377 
    416378def _httpretrieve_parse_given_url(url): 
    417   # checks if the URL is in the right format and returns a string of host, port, path and query by parsing the URL   
     379  # Checks that the URL is in the right format and returns a tuple of host, 
     380  # port, path and query. 
    418381  try: 
    419    # returns a dictionary of {scheme, netloc, path, quer, fragment, username, password, hostname and port} form the url                      
    420     urlparse = urlparse_urlsplit(url)   
     382    urlparse = urlparse_urlsplit(url) 
    421383  except Exception, e: 
    422384    raise HttpUserInputError('Given URL error: ' + str(e)) 
    423385  else: 
    424     # check if the protocol is http  
    425386    if urlparse['scheme'] != 'http': 
    426       raise HttpUserInputError('Given URL error: the given protocol ' + urlparse['scheme'] + ' isnt supported')        
     387      raise HttpUserInputError('Given URL error: the given protocol ' + \ 
     388          urlparse['scheme'] + ' isnt supported') 
    427389    if urlparse['hostname'] == None: 
    428       raise HttpUserInputError('Given URL error: host name is not given')  
    429  
    430     # get only the host path, port, query from the urlparse dictionary 
     390      raise HttpUserInputError('Given URL error: host name is not given') 
     391 
    431392    host = urlparse['hostname'] 
    432393    path = urlparse['path'] 
    433394    query = urlparse['query'] 
    434      
    435     # use default port 80 if the port isnt given                     
    436     if urlparse['port'] == None: 
    437       port = 80  
    438     else: 
    439       port = urlparse['port'] 
    440  
    441     return host, port, path, query 
    442  
    443  
    444  
    445  
    446  
    447 def _httpretrieve_buildhttprequest(http_header, port, host, path, url_query, dict_query, http_post): 
    448   # send http request to the http web server using socket connection   
    449    
     395    port = urlparse.get('port', 80) 
     396 
     397    return (host, port, path, query) 
     398 
     399 
     400 
     401 
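The removed lines fall back to port 80 whenever `urlparse['port']` is `None`, while the added line relies on `urlparse.get('port', 80)`, which only falls back when the key is missing. The sketch below assumes the parsed URL is a plain dictionary that may store `'port': None`, as the old `== None` check suggests. `choose_port` is a hypothetical helper, not part of httpretrieve.repy.

```python
def choose_port(parsed_url, default_port=80):
  # parsed_url is assumed to be a dict; 'port' may be absent or None.
  port = parsed_url.get('port')
  if port is None:
    return default_port
  return port


# dict.get() only uses its default when the key is absent:
stored_none = {'port': None}.get('port', 80)   # None, not 80
```

With this helper, both a missing key and an explicit `None` resolve to the default port.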
     402def _httpretrieve_buildhttprequest(http_header, port, host, path, url_query, 
     403    dict_query, http_post): 
     404  # Builds the HTTP request (headers plus any POST body) as a string.
     405 
    450406  if http_post != None: 
    451     # there is a posted data, thus use http POST command 
    452     
    453     # check if the given post data is valid 
     407    # There is posted data; use HTTP POST.
     408 
    454409    if not type(http_post) == dict: 
    455       raise HttpUserInputError('The given http_post is not a dictionary, given: ' + str(type(http_post))) 
    456  
    457     # change the given http post dictionary into a encoded post data with a key and val   
    458     try:  
     410      raise HttpUserInputError('The given http_post is not a ' + \ 
     411          'dictionary, given: ' + str(type(http_post))) 
     412 
     413    # Convert the dictionary of form values into a POST message body. 
     414    try: 
    459415      http_post = urllib_quote_parameters(http_post) 
    460416    except Exception, e: 
    461       raise HttpUserInputError('Error encoding the given http post dictionary ' +  str(http_post) + str(e)) 
    462  
    463  
    464     # build the main http request header which includes the GET/POST and the Host name field 
    465     httpheader = _httpretrieve_httprequestmain_header('POST', url_query, dict_query, path, host, port) 
    466  
    467     # if given add a client http header to the request 
     417      raise HttpUserInputError('Error encoding the given http post ' + \ 
     418          'dictionary ' + str(http_post) + str(e)) 
     419 
     420    # Build the minimal HTTP request header -- includes only the request 
     421    # and the Host field. 
     422    httpheader = _httpretrieve_httprequestmain_header('POST', url_query, \ 
     423        dict_query, path, host, port) 
     424 
     425    # Build the rest of the request. 
    468426    httpheader += _httpretrieve_parse_clienthttpheader(http_header) 
    469  
    470     # indicate the http post content length 
    471     httpheader += 'Content-Length: ' + str(len(http_post)) + '\r\n'   
    472     # add a new line to indicate that the http header is done and the http post is followed. 
     427    httpheader += 'Content-Length: ' + str(len(http_post)) + '\r\n' 
    473428    httpheader += '\r\n' 
    474     # include the posted data after the http header empty line 
    475429    httpheader += http_post 
    476          
    477430 
    478431  else: 
    479     # there is no posted data, use http GET method    
    480     httpheader = _httpretrieve_httprequestmain_header('GET', url_query, dict_query, path, host, port) 
    481     # add client header if given  
     432    # There is no posted data, use HTTP GET. 
     433    httpheader = _httpretrieve_httprequestmain_header('GET', url_query, \ 
     434        dict_query, path, host, port) 
    482435    httpheader += _httpretrieve_parse_clienthttpheader(http_header) 
    483     # add a new line to indicate that the header request is complete 
    484436    httpheader += '\r\n' 
    485437 
    486  
    487   # return header with a new line which is signal for http header is done  
    488   return httpheader  
    489  
    490  
    491  
    492  
    493 def _httpretrieve_httprequestmain_header(http_command, url_query, dict_query, path, host, port): 
    494   # builds the first two main http request headers which include the GET/POST and the HOST name 
    495    
    496   # before building the httprequest make sure there isnt two fields of query given by the client 
     438  # Return the request; the blank line marks the end of the HTTP headers.
     439  return httpheader 
     440 
     441 
     442 
     443 
     444def _httpretrieve_httprequestmain_header(http_command, url_query, \ 
     445    dict_query, path, host, port): 
     446  # Builds a minimal HTTP request, returning it as a string. 
     447 
     448  # Sanity check -- the user should have only given us one set of data. 
    497449  if url_query != '' and dict_query != None: 
    498     # cant have two different fields with query 
    499     raise HttpUserInputError('Cant input a http query with the url and an extra parameter dictionary with a http query') 
     450    raise HttpUserInputError('Cant input a http query with the url and ' + \ 
     451        'an extra parameter dictionary with a http query') 
    500452 
    501453  elif dict_query != None: 
    502     # user has given a http query  
    503     try:  
     454    # Send form data via GET. 
     455    try: 
    504456      encoded_query = '?' + urllib_quote_parameters(dict_query) 
    505457    except Exception, e: 
    506       raise HttpUserInputError('Error encoding the given http query dictionary ' +  str(dict_query) + str(e)) 
     458      raise HttpUserInputError('Error encoding the given http ' + \ 
     459          'query dictionary ' + str(dict_query) + str(e)) 
    507460 
    508461  elif url_query != '': 
    509     # if there is a query include the query on the main header('?' is used as a seperater between path) 
     462    # Send an arbitrary string via GET. 
    510463    encoded_query = '?' + url_query 
    511464  else: 
    512     # there is no query 
    513465    encoded_query = '' 
    514      
    515  
    516   # if there is no path add a '/' on the request and if there is a path use the given path 
     466 
     467  # A non-empty path is a required part of an HTTP request. 
    517468  addpath = '/' 
    518469  if path != '': 
    519470    addpath = path 
    520471 
    521   # FIRST header which includes the POST/GET request   
    522   main_httpheader = http_command + ' ' + addpath + encoded_query + ' HTTP/1.0\r\n' 
    523  
    524  
    525   # if port is 80, dont need to include the port upon request 
     472  main_httpheader = http_command + ' ' + addpath + encoded_query + \ 
     473      ' HTTP/1.0\r\n' 
     474 
     475  # We don't need to include the port in the Host header if it is 80. 
    526476  addport = '' 
    527477  if port != 80: 
    528     # if the port is not 80 the host needs to include the port number on Host header 
    529     # (':' is used as a separater between host and port) 
    530478    addport = ':' + str(port) 
    531479 
    532   # SECOND line of the header request which include the host name with port if the port is not 80    
    533480  main_httpheader += 'Host: ' + host + addport + '\r\n' 
    534  
    535   # return the firs two lines of the http request  
    536481  return main_httpheader 
    537482 
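As a rough illustration of what the function above produces, here is a minimal standalone sketch of the request line and Host header. `build_request_head` is a hypothetical name, and the query is assumed to be already encoded; the real function also validates and encodes a query dictionary.

```python
def build_request_head(command, host, port, path='', query=''):
  # An empty path is replaced by '/', since the request line needs a path.
  target = path or '/'
  if query:
    target += '?' + query
  head = command + ' ' + target + ' HTTP/1.0\r\n'

  # The port is only appended to the Host header when it is not 80.
  host_value = host
  if port != 80:
    host_value += ':' + str(port)
  return head + 'Host: ' + host_value + '\r\n'
```

For example, `build_request_head('GET', 'example.com', 8080, '/a', 'x=1')` yields a request line of `GET /a?x=1 HTTP/1.0` followed by `Host: example.com:8080`.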
     
    540485 
    541486def _httpretrieve_parse_clienthttpheader(http_header): 
    542   # builds a http header from the given http header dictionary 
     487  # Converts a dictionary of HTTP request headers into a string. 
     488 
    543489  if http_header == None: 
    544     # if the http header isnt given return a empty string 
    545490    return '' 
     491 
    546492  elif not type(http_header) == dict: 
    547     # raise an exception if the http header isnt dictionary 
    548     raise HttpUserInputError('The given http_post is not a dictionary, given: ' + str(type(http_header))) 
    549   else:  
    550     # take the given key and val from the http_header dictionary and add them to the http header with 
    551     # correct http format 
     493    raise HttpUserInputError('The given http_header is not a dictionary, ' + \
     494        'given: ' + str(type(http_header))) 
     495 
     496  else: 
    552497    clienthttpheader = '' 
    553498    for key, val in http_header.items(): 
    554       # if the key is not capital letter raise an exception 
    555       clienthttpheader += key + ' : ' + val + '\r\n'   
    556  
    557     # return the string of the http header   
     499      clienthttpheader += key + ' : ' + val + '\r\n' 
    558500    return clienthttpheader 
    559501 
     
    562504 
    563505def _httpretrieve_sendhttprequest(sock, httpheader): 
    564   # send the request, and if there is any error raise an excetion 
     506  # Send the HTTP request; raise an exception on error. 
    565507  try: 
    566508    sock.send(httpheader) 
    567509  except Exception, e: 
    568510    sock.close() 
    569     raise HttpConnectionError('Connection error: on sending http request to server ' + str(e)) 
    570  
    571  
    572  
    573    
     511    raise HttpConnectionError('Connection error: on sending http ' + \ 
     512        'request to server ' + str(e)) 
     513 
     514 
     515 
     516 
    574517def _httpretrieve_receive_httpheader(sock, header_timeout, httpheader_limit): 
    575   # receives the http header leaving alone rest of the http response and returns in list 
    576   # of each header as a line. default httpheader limit is set to 8 kb                         
    577  
    578   # set a time out if the server fails to send http header  
     518  # Receives the HTTP headers only. Returns them as a list of strings. 
     519 
    579520  sock.settimeout(header_timeout) 
    580521 
    581   httpheader_received = 0  
    582   httpheader = ''  
     522  httpheader_received = 0 
     523  httpheader = '' 
    583524  while True: 
    584     # receive until a empty line (\n\n or \r\n\r\n ) which separates the 
    585     # http header from the http content 
     525    # CRLFCRLF separates the HTTP headers from the body of the response. 
    586526    if '\r\n\r\n' in httpheader: 
    587       # return split to return a list of httpheader lines 
    588527      return httpheader.split('\r\n') 
     528 
     529    # Against the HTTP spec, we also accept LFLF as a mark of reaching the 
     530    # end of the headers. 
    589531    if '\n\n' in httpheader: 
    590       # return split to return a list of httpheader lines 
    591532      return httpheader.split('\n') 
    592533 
    593534    if httpheader_limit == httpheader_received: 
    594       sock.close()                   
    595       raise HttpHeaderFormatError('Http header length Error: The http header is too long, exceeded 8 kb') 
    596                          
     535      sock.close() 
     536      raise HttpHeaderFormatError('Http header length Error: The http ' + \ 
     537          'header is too long, exceeded 8 kb') 
     538 
    597539    try: 
    598       # receive one character at a time inorder to check for the empty line 
    599540      content = sock.recv(1) 
    600       # keep track of the received characters to raise an exception if the limit is exceeded                   
    601       httpheader_received += 1                   
     541      httpheader_received += 1 
    602542 
    603543    except SocketTimeoutError: 
    604       raise HttpHeaderReceivingError('Timeout Error on receiving header: server taking too long to send http header') 
     544      raise HttpHeaderReceivingError('Timeout Error on receiving ' + \ 
     545          'header: server taking too long to send http header') 
     546 
    605547    except Exception, e: 
    606       sock.close()  
    607       raise HttpHeaderReceivingError('Error on recieving http header: ' + str(e)) 
     548      sock.close() 
     549      raise HttpHeaderReceivingError('Error on recieving http ' + \ 
     550          'header: ' + str(e)) 
     551 
    608552    else: 
    609       # if there was not receiving error add keep on adding the receieved content 
    610553      httpheader += content 
    611554 
     
    613556 
    614557 
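The end-of-headers check above can be sketched on an in-memory string, without a socket. `split_header_block` is a hypothetical helper. Unlike the function in this changeset, it drops the empty strings left by the terminator and returns `None` while the headers are still incomplete.

```python
def split_header_block(raw):
  # Try the standard CRLFCRLF terminator first, then the lenient LFLF one.
  for terminator, newline in (('\r\n\r\n', '\r\n'), ('\n\n', '\n')):
    end = raw.find(terminator)
    if end != -1:
      return raw[:end].split(newline)
  # No blank line seen yet: the header block is not complete.
  return None
```

A caller reading from a socket would keep appending received data to `raw` until this returns a list or a size limit is reached.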
    615 def _httpretrieve_get_httpstatuscode(httpHeaderLines):  
    616   # checks if the http status code is valid and return the status number and msg  
    617  
    618   # http response header includes 3 "words": HTTP<version> http_status_number http_status_msg  
     558def _httpretrieve_get_httpstatuscode(httpHeaderLines): 
     559  # Parses the status line and returns the status number and message.
     560 
     561  # The first line of an HTTP response is composed of: 
     562  # HTTP<version> http_status_number http_status_msg 
    619563  httpstatusheader = httpHeaderLines[0] 
    620564  headersplit = httpstatusheader.split(' ', 2) 
    621565 
    622   # length of the header has to be 3 or greater because depending up on the http_status_msg might be more than one word 
    623566  if len(headersplit) != 3: 
    624     raise HttpHeaderFormatError('Invalid Http header status code format: Correct format is HTTP<version> http_status_number http_status_msg: Given '  + httpstatusheader) 
     567    raise HttpHeaderFormatError('Invalid Http header status code ' + \ 
     568        'format: Correct format is HTTP<version> http_status_number ' + \ 
     569        'http_status_msg: Given '  + httpstatusheader) 
    625570  if not httpstatusheader.startswith('HTTP'): 
    626     raise HttpHeaderFormatError('Invalid Http header status code format: Http header status code should start of with HTTP<version> but given: '  + httpstatusheader) 
    627  
    628   # the first split is the http version 
    629   http_version = headersplit[0]                       
    630  
    631   # check if http_status_number is valid int 
    632   try:  
     571    raise HttpHeaderFormatError('Invalid Http header status code ' + \ 
     572        'format: Http header status code should start of with ' + \ 
     573        'HTTP<version> but given: '  + httpstatusheader) 
     574 
     575  http_version = headersplit[0] 
     576 
     577  try: 
    633578    int(headersplit[1]) 
    634579  except ValueError, e: 
    635     raise HttpHeaderFormatError('Invalid Http header status code format: Status number should be a int, Given: ' + str(headersplit[1]) + str(e)) 
     580    raise HttpHeaderFormatError('Invalid Http header status code ' + \ 
     581        'format: Status number should be a int, Given: ' + \ 
     582        str(headersplit[1]) + str(e)) 
    636583  else: 
    637584    http_status_number = headersplit[1] 
    638    
    639   # what ever is left is the http status msg 
     585 
    640586  http_status_msg = headersplit[2] 
    641    
    642   # return the values 
    643587  return http_status_number, http_status_msg 
    644588 
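The status-line parsing above splits on at most two spaces so that a multi-word message stays intact. A minimal sketch of the same idea follows. `parse_status_line` is a hypothetical name, and it differs from the changeset in returning the status number as an `int` and raising the built-in `ValueError` rather than `HttpHeaderFormatError`.

```python
def parse_status_line(line):
  # 'HTTP/1.0 404 Not Found' -> ['HTTP/1.0', '404', 'Not Found']
  parts = line.split(' ', 2)
  if len(parts) != 3 or not parts[0].startswith('HTTP'):
    raise ValueError('malformed status line: ' + repr(line))
  # int() raises ValueError itself if the status number is not numeric.
  return int(parts[1]), parts[2]
```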
     
    647591 
    648592def _httpretrieve_receive_httperror_content(sock, contentlength): 
    649   # receives the http error content which is located after the http error header                         
    650   httperror_content = ''  
     593  # Receive the error message (this is called when the server returns an 
     594  # 'error' response). 
     595 
     596  httperror_content = '' 
    651597  while True: 
    652598    try: 
    653       content = sock.recv(1024)                   
     599      content = sock.recv(1024) 
    654600 
    655601    except SocketTimeoutError: 
    656       raise HttpContentReceivingError('Timeout Error on receiving http error conent: server taking too long to send http error content') 
     602      raise HttpContentReceivingError('Timeout Error on receiving http ' + \ 
     603          'error content: server taking too long to send http error content') 
    657604    except Exception, e: 
    658       # socket closed - signal for when the server is done sending error content 
    659       # if there is any other exceptions close connection and raise an error besides socket closing   
    660       if 'Socket closed' not in str(e):  
    661         sock.close()             
    662         raise HttpContentReceivingError('Error on receiving http error content: ' + str(e)) 
     605      if 'Socket closed' not in str(e): 
     606        sock.close() 
     607        raise HttpContentReceivingError('Error on receiving http error ' + \ 
     608            'content: ' + str(e)) 
    663609      break 
    664          
     610 
    665611    else: 
    666       # if there was not a receiving error keep on adding the receieved content 
    667612      httperror_content += content 
    668613 
    669   # return the received http error content. If the content length is given check 
    670   # if the content length maches the received content 
    671614  if contentlength != None: 
    672615    if contentlength != len(httperror_content): 
    673       raise HttpContentLengthError('Error on receiving http error content: received conent length: ' + str(len(httperror_content)) + ' actual content length: ' + str(contentlength))    
    674   return httperror_content       
    675  
    676  
    677  
    678      
    679 def _httpretrieve_raise_httpstatuscode_error(http_status_number, http_status_msg, http_errorcontent):  
    680   # raises an exception using the http_status_number 1xx for Informational, 2xx for Success 3xx for Redirection, 
    681   # 4xx for Client Error, and 5xx Server Error 
    682   
    683   # raise a detailed error message explaining the http_status_number and http_status_msg for popular http errors   
     616      raise HttpContentLengthError('Error on receiving http error ' + \ 
     617          'content: received conent length: ' + \ 
     618          str(len(httperror_content)) + ' actual content length: ' + \ 
     619          str(contentlength)) 
     620  return httperror_content 
     621 
     622 
     623 
     624 
     625def _httpretrieve_raise_httpstatuscode_error(http_status_number, http_status_msg, http_errorcontent): 
     626  # Raises an exception for the status code. 
     627 
     628  # Arbitrarily chosen individual status codes: 
    684629  if http_status_number == '202': 
    685     raise HttpError202('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' http proccesing not responding. Http error content: ' + http_errorcontent)       
     630    raise HttpError202('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' http proccesing not responding. Http error content: ' + http_errorcontent) 
    686631  elif http_status_number == '204': 
    687     raise HttpError204('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' thier is no http body content. Http error content: ' + http_errorcontent)  
     632    raise HttpError204('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' thier is no http body content. Http error content: ' + http_errorcontent) 
    688633  elif http_status_number == '300': 
    689     raise HttpError300('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' multiple redirect isnt suported. Http error content: ' + http_errorcontent)                 
     634    raise HttpError300('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' multiple redirect isnt suported. Http error content: ' + http_errorcontent) 
    690635  elif http_status_number == '404': 
    691     raise HttpError404('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' cant find anything matching the given url. Http error content: ' + http_errorcontent)                 
     636    raise HttpError404('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' cant find anything matching the given url. Http error content: ' + http_errorcontent) 
    692637  elif http_status_number == '403': 
    693     raise HttpError403('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' the request was illegal. Http error content: ' + http_errorcontent)                 
     638    raise HttpError403('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' the request was illegal. Http error content: ' + http_errorcontent) 
    694639  elif http_status_number == '400': 
    695     raise HttpError400('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' the request contians bad syntex. Http error content: ' + http_errorcontent)                 
     640    raise HttpError400('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' the request contians bad syntex. Http error content: ' + http_errorcontent) 
    696641  elif http_status_number == '500': 
    697     raise HttpError500('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' The server encountered an unexpected condition. Http error content: ' + http_errorcontent)                  
     642    raise HttpError500('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' The server encountered an unexpected condition. Http error content: ' + http_errorcontent) 
    698643  elif http_status_number == '502': 
    699     raise HttpError502('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' acting like a gateway received an invalid response. Http error content: ' + http_errorcontent)                 
    700  
    701   # if the http number wasnt any of the popular http error msgs, raise an exception using 
    702   # the defualt http status number with http status msg   
    703   elif http_status_number >= '100' and http_status_number < '200':  
    704     raise HttpError1xx('Http response error: Information ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent)  
    705   elif http_status_number > '200' and http_status_number < '300':  
     644    raise HttpError502('Http response error: ' + http_status_number + ' ' + http_status_msg +  ' acting like a gateway received an invalid response. Http error content: ' + http_errorcontent) 
     645 
     646  # Ranges: 
     647  elif http_status_number >= '100' and http_status_number < '200': 
     648    raise HttpError1xx('Http response error: Information ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent) 
     649  elif http_status_number > '200' and http_status_number < '300': 
    706650    raise HttpError2xx('Http response error: success error ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent) 
    707   elif http_status_number >= '300' and http_status_number < '400':  
     651  elif http_status_number >= '300' and http_status_number < '400': 
    708652    raise HttpError3xx('Http response error: Redirection error' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent) 
    709   elif http_status_number >= '400' and http_status_number < '500':  
    710     raise HttpError4xx('Http response error: client error ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent)   
    711   elif http_status_number >= '500' and http_status_number < '600':  
     653  elif http_status_number >= '400' and http_status_number < '500': 
     654    raise HttpError4xx('Http response error: client error ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent) 
     655  elif http_status_number >= '500' and http_status_number < '600': 
    712656    raise HttpError5xx('Http response error: server error: ' + http_status_number + ' ' + http_status_msg + '.Http error content: ' + http_errorcontent) 
    713657  else: 
    714     raise HttpStatusCodeError('Http response error: invalid http status response, given ' + http_status_number + '.Http error content: ' + http_errorcontent)   
     658    raise HttpStatusCodeError('Http response error: invalid http status response, given ' + http_status_number + '.Http error content: ' + http_errorcontent) 
    715659 
    716660 
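The range checks above compare status numbers as strings, which orders them character by character. That works for three-digit codes, but a malformed value such as `'1000'` would sort between `'100'` and `'200'`. The sketch below shows an integer-based classification as one possible alternative. `status_class` is a hypothetical helper, not part of this changeset.

```python
def status_class(code):
  # Convert first so that comparisons are numeric, not lexicographic.
  code = int(code)
  if code < 100 or code > 599:
    return None
  return ('1xx', '2xx', '3xx', '4xx', '5xx')[code // 100 - 1]


# String comparison treats '1000' as lying inside the 1xx range:
string_range_hit = '100' <= '1000' < '200'
```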
     
    718662 
    719663def _httpretrieve_httpredirect(httpheaderlines): 
    720   # given http header retruns the redirect location    
    721  
    722   # if there is a redirect location given by the server it will look like 
    723   # eg.'Location: http://www.google.com' 
     664  # Determine redirect location from response headers. 
     665 
    724666  for headerline in httpheaderlines: 
    725     if headerline.startswith('Location:'): 
    726       # if found redirect need to strip out 'Location: ' to return the url    
     667    if headerline.startswith('Location: '): 
    727668      redirect = headerline[len('Location: '):] 
    728                          
    729       # check if the redirect has given a location then return it 
     669 
    730670      if len(redirect) == 0: 
    731         raise HttpHeaderFormatError('Http header redirection format error: http server gave a redierect location with no URL') 
     671        raise HttpHeaderFormatError('Http header redirection format ' + \ 
     672            'error: http server gave a redierect location with no URL') 
    732673      return redirect 
    733                          
    734   # there wasn't a redirect location given 
    735   raise HttpHeaderFormatError('Http header redirection format error: http redirect header didnt include the location')   
     674 
     675  raise HttpHeaderFormatError('Http header redirection format error: ' + \ 
     676      'http redirect header didnt include the location') 
    736677 
    737678 
     
    739680 
    740681def _httpretrieve_get_contentlength(httpheaderlines): 
    741   # returns the content legth if given or returns None if not given by server 
    742  
    743   # if there is a content length given by the server it will look like 
    744   # eg.'Content-Length: 34' 
     682  # Determines the value of the Content-Length header. 
     683 
    745684  for headerline in httpheaderlines: 
    746     if headerline.startswith('Content-Length:'): 
    747       # if found content length need to strip out 'Content-Length: ' to return the length    
    748        
    749       try:  
     685    if headerline.startswith('Content-Length: '): 
     686      try: 
    750687        contentlength = int(headerline[len('Content-Length: '):]) 
    751688      except ValueError, e: 
    752         raise HttpHeaderFormatError('Http header Content-Length format error: http server provided content length that isnt a int ' + str(e))                 
    753  
    754        
    755       # check if the content length is valid and retrun it  
    756       if contentlength <= 0:                 
    757         raise HttpHeaderFormatError('Http header Content-Length format error: provided content length with invalid number ' + str(contentlength)) 
     689        raise HttpHeaderFormatError('Http header Content-Length format ' + \ 
     690            'error: http server provided content length that isnt a int ' + \ 
     691            str(e)) 
     692 
     693      if contentlength <= 0: 
     694        raise HttpHeaderFormatError('Http header Content-Length format ' + \ 
     695            'error: provided content length with invalid number ' + \ 
     696            str(contentlength)) 
    758697      else: 
    759698        return contentlength 
    760                          
    761   # there wasn't a content-length line or the content length was given but didnt give a int  
     699 
    762700  return None
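The lookup above matches only the exact prefix `'Content-Length: '`. HTTP header names are case-insensitive and the space after the colon is optional, so a more tolerant lookup could be sketched as follows. `find_content_length` is a hypothetical helper; unlike the function above, it does not reject zero or negative lengths and lets `int()` raise the built-in `ValueError` on a non-numeric value.

```python
def find_content_length(header_lines):
  for line in header_lines:
    # Split 'Name: value' at the first colon; sep is '' if there is none.
    name, sep, value = line.partition(':')
    if sep and name.strip().lower() == 'content-length':
      return int(value.strip())
  # No Content-Length header was present.
  return None
```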