Ticket #982 (closed task: fixed)

Opened 8 years ago

Last modified 7 years ago

encode could possibly be unsafe...

Reported by: justinc Owned by: mkaplan
Priority: major Milestone:
Component: repy Version: 0.1t
Severity: Medium Keywords:
Cc: monzum, justinc Blocking:
Blocked By:

Description

Some of the encode types might be unsafe. We should probably restrict repy so that the encoding type is only something well tested like utf-8.

This needs to happen both in the string method encode and in the coding declaration on the first line of files / strings.
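
A minimal sketch of what restricting the string method could look like, written for Python 3 purely as an illustration. The function name restricted_encode and the ALLOWED_ENCODINGS allow-list are hypothetical and are not taken from the Repy patch:

```python
import codecs

# Hypothetical allow-list; the ticket only names utf-8 as an example.
ALLOWED_ENCODINGS = frozenset(["utf-8"])


def restricted_encode(text, encoding="utf-8"):
    """Encode text, refusing any codec outside the allow-list."""
    try:
        # codecs.lookup() resolves aliases such as "UTF8" to a canonical name.
        canonical = codecs.lookup(encoding).name
    except LookupError:
        raise ValueError("unknown encoding: %r" % (encoding,))
    if canonical not in ALLOWED_ENCODINGS:
        raise ValueError("encoding not allowed: %r" % (encoding,))
    return text.encode(canonical)
```

Comparing canonical names rather than the raw argument means spellings like "UTF8" and "utf_8" are all treated as the same codec.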

I'll apply a patch to Repy V1 and Alex can use this patch to determine how to apply this to Repy V2.

Thanks to Maciej Fijalkowski for mentioning this potential issue.

Attachments

check_encoding.py Download (1.6 KB) - added by mkaplan 7 years ago.
module with methods to check the encoding of a file.

Change History

Changed 8 years ago by justinc

  • cc justinc added; alexjh removed
  • owner changed from justinc to alexjh
  • status changed from new to assigned

I've addressed this in r4360.

I spent the most time trying to find a good unit test for disabling the comment encoding string. Finally I stumbled upon rot13 encoding...
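
For context, rot13 only swaps ASCII letters, and Python 2 accepted it in a coding comment, so a file of scrambled text could run as real code. That is presumably what makes it a convenient test case. The snippet below is only an illustration of the codec itself (Python 3 standard library), not the unit test from r4360:

```python
import codecs

source = 'log("hi")'

# rot13 rotates each ASCII letter by 13 places and leaves everything else alone.
scrambled = codecs.encode(source, "rot_13")
print(scrambled)

# Applying the same rotation again restores the original text.
restored = codecs.decode(scrambled, "rot_13")
print(restored)
```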

I'm kicking the ticket over to Alex for the V2 changes.

Changed 7 years ago by justinc

  • owner changed from alexjh to mkaplan

Changed 7 years ago by mkaplan

  • status changed from assigned to accepted

Changed 7 years ago by mkaplan

Attached check_encoding.py: module with methods to check the encoding of a file.

Changed 7 years ago by mkaplan

Unfortunately, prepending the code with "# coding: utf-8\n\n" shifts every reported line number by two, so the traceback (in both cases below) shows the wrong line number.

Sample program 1 (runtime error):

log(foo)

Sample program 2 (static-analysis error):

print "foo"
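
The shift can be reproduced with a small sketch. This is an illustration under Python 3, not Repy code. The helper error_line is hypothetical, and under Python 3 the second sample fails at compile time as a syntax error, which stands in here for Repy's static-analysis check:

```python
import traceback

PREFIX = "# coding: utf-8\n\n"


def error_line(source):
    """Return the line number reported for the first error in source."""
    try:
        exec(compile(source, "<user>", "exec"), {})
    except SyntaxError as exc:
        # Error found before the program runs.
        return exc.lineno
    except Exception as exc:
        # Runtime error: the innermost frame belongs to the user code.
        return traceback.extract_tb(exc.__traceback__)[-1].lineno
    return None


for sample in ("log(foo)", 'print "foo"'):
    print(sample, error_line(sample), error_line(PREFIX + sample))
```

In both cases the error sits on line 1 of the user's program but is reported on line 3 once the two-line prefix is added.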

Changed 7 years ago by mkaplan

A possible solution based on Python 3.1's tokenize.detect_encoding() is here: https://bitbucket.org/ned/coveragepy/src/1d44060ab103/coverage/phystokens.py
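
A rough sketch of how a check built on detect_encoding() might look in Python 3, where the function is part of the standard library. The helper name declares_only_utf8 is hypothetical, and this is neither the linked coverage.py code nor the attached check_encoding.py:

```python
import io
import tokenize


def declares_only_utf8(source_bytes):
    """Return True if the source declares (or defaults to) UTF-8."""
    try:
        # detect_encoding() reads at most two lines, looking for a BOM
        # or a coding declaration, and returns the normalized name.
        encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    except SyntaxError:
        # Raised for unknown or conflicting declarations; treat as unsafe.
        return False
    return encoding in ("utf-8", "utf-8-sig")
```

Unlike prepending a header, this approach leaves the source untouched, so line numbers in tracebacks stay correct.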

To be as secure as possible, we will stick to the safer version and prepend "# coding: utf-8", as is done in r4360, even though it breaks the line numbers.
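
The reason the prepended header takes effect is that a coding declaration is only honored on the first two lines of a file, so any declaration in the user's code is pushed out of range. The sketch below illustrates this with Python 3's tokenize.detect_encoding(). The helper effective_encoding and the sample user_code are made up for the example:

```python
import io
import tokenize

HEADER = b"# coding: utf-8\n\n"


def effective_encoding(source_bytes):
    """Return the encoding Python would use to read this source."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Hypothetical user program that tries to pick its own encoding.
user_code = b"# coding: latin-1\nlog('hi')\n"

print(effective_encoding(user_code))           # the user's declaration wins
print(effective_encoding(HEADER + user_code))  # the prepended header wins
```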

Changed 7 years ago by mkaplan

  • status changed from accepted to closed
  • resolution set to fixed

Fixed in r5715.
