64 Commits
0.9 ... stubs

Author SHA1 Message Date
dfca46886f filled in stub files 2018-07-26 15:30:02 -04:00
6d527a2f1d added type hint stubs for all modules 2018-07-26 15:24:20 -04:00
80d6104387 conflict 2018-07-26 15:08:08 -04:00
Martin Rusev
0464a5a74d Merge pull request #134 from zevaverbach/126_128_130
Added flags to messages, moved classes out of __init__.py, and documented that the inbox is where Imbox.messages are from without any folder argument.
2018-07-26 21:06:51 +02:00
5b27fa1b76 removed errant dash that was causing Travis-CI to fail 2018-07-26 11:06:48 -04:00
9cffb51a81 Added missing documentation for all supported query keyword arguments. Fixes #124. Added Pycharm directory to .gitignore.
Fixed var names in documentation of query keywords

moved Messages and Imbox to their own modules, imported Imbox.imbox into __init__.py and put it in __all__. fixes #130.

clarified in documentation and Imbox.messages logging that, unless a folder is specified in the kwargs to Imbox.messages, the returned messages will be from the inbox.  In the documentation this is accomplished exclusively by the var names. fixes #128.

amended `8df7d7c` to reflect manual changes made to `README.rst` in current master, but also added `inbox_` to several var names to make that explicit in the documentation.  Added flags to messages returned by `fetch_email_by_uid`, using the new function `parse_flags` in `parser.py`.  Fixes #126.

added TODO back into query.py
2018-07-26 10:50:10 -04:00
Martin Rusev
06aa4e054b Update README.rst 2018-07-26 14:47:02 +02:00
Martin Rusev
63bddbd73c Update README.rst 2018-07-26 14:45:34 +02:00
2838639bfa conflict in .gitignore 2018-07-25 14:25:14 -04:00
Martin Rusev
34149efed5 Merge pull request #129 from zevaverbach/encapsulate_generator
Created Message class to encapsulate the "messages" generator
2018-07-25 19:46:49 +02:00
73fafcb368 made Messages methods private, as well as uid_list. 2018-07-25 12:11:59 -04:00
8e26e92a39 Moved query_ids from Imbox to Messages.
Moved `fetch_list` to `Messages` and renamed `fetch_email_list`, moving assignment of uid_list to the constructor of `Messages`.

Moved `fetch_by_uid` from `Imbox` to a pure function in `parser` module.

Replaced call to `Imbox.fetch_list` with a call to `Messages` instead.

created `Messages` class, which encapsulates the generator `Messages.fetch_email_list`, while also implementing `__len__` to show how many emails match a given query, and `__iter__` to refresh the `fetch_email_list` generator when it's exhausted.  It also implements `__getitem__` to support indexing of the emails matching a query.  Messages requires the `connection` and `parser_policy` established in `Imbox` and accepts arbitrary keyword arguments, which it uses in the IMAP query as well as in the `__repr__`.
2018-07-25 11:48:19 -04:00
635d15441e shortened up the ImapTransport constructor, a couple of similar cleanups 2018-07-25 09:00:21 -04:00
55f64a1922 removed some unused args, renamed to avoid shadowing a name in the global scope. 2018-07-25 08:46:59 -04:00
71942a69e8 fixed var names in documentation of query keywords 2018-07-25 08:27:40 -04:00
b4cb03e145 added Pycharm directory to .gitignore 2018-07-25 08:24:05 -04:00
b53a1e6837 Added missing documentation for all supported query keyword arguments. Fixes #124. 2018-07-25 08:22:50 -04:00
Martin Rusev
a1801af56e Merge pull request #119 from sblondon/factorize-version
Reuse __version__ for the library version
2018-04-05 10:21:07 +02:00
Stephane Blondon
6a0da4b105 reuse __version__ for the library version 2018-04-04 18:27:27 +02:00
Martin Rusev
5d09b6f3e4 Merge pull request #117 from sblondon/fix-encoding-in-logger
Fix encoding in logger
2018-04-04 15:43:56 +02:00
Stephane Blondon
ecb62b585c refactoring: string formatting more readable 2018-03-21 18:31:08 +01:00
Stephane Blondon
cbb46ef078 fix decoding of sender e-mail if badly encoded 2018-03-21 18:24:59 +01:00
Stephane Blondon
79ce81aa9d fix spelling error 2018-03-19 20:56:22 +01:00
Martin Rusev
96ce737df5 Merge pull request #112 from wagner-certat/version
Expose version in library
2018-01-08 12:56:01 +01:00
Sebastian Wagner
a94682fc3c Expose version in library 2018-01-08 12:07:21 +01:00
martinrusev
85abe48c8c Update version 2017-12-05 19:08:59 +01:00
martinrusev
9956d182eb Update changelog 2017-12-05 18:56:09 +01:00
Martin Rusev
ef68d92021 Update README.rst 2017-11-30 23:01:03 +01:00
Martin Rusev
43a37818ab Merge pull request #108 from balsagoth/feature/starttls
Add starttls support
2017-11-30 23:00:36 +01:00
Martin Rusev
7b1bda6126 Merge pull request #109 from balsagoth/feature/query_improve
add date__on to query and some other improvements
2017-11-30 23:00:27 +01:00
Ivan Pereira
fe965e7d19 add date__on to query and some other improvements 2017-11-30 21:42:16 +00:00
Ivan Pereira
cfca92df60 Update docs 2017-11-30 16:30:43 +00:00
Ivan Pereira
5664e9c48a Add support for starttls 2017-11-30 16:14:17 +00:00
Martin Rusev
6c11c759c0 Merge pull request #107 from memanikantan/master
Query and mark emails as flagged/starred.
2017-10-29 09:46:36 +01:00
Manikantan Ramchandran
2f72aa13df typo fix 2017-10-28 17:17:49 +05:30
Manikantan Ramchandran
ffd4550524 added tests for flagged and unflagged queries 2017-10-28 17:16:44 +05:30
Manikantan Ramchandran
14c592136e added mark as flagged or starred feature 2017-10-28 16:59:11 +05:30
Manikantan Ramchandran
85025b34ff added flagged and unflagged query options 2017-10-28 16:58:05 +05:30
Martin Rusev
ed251ce999 Typo in Readme 2017-10-28 10:05:10 +02:00
Martin Rusev
6dc0e7b3f2 Merge pull request #106 from sblondon/master
add infos about how to list folders, delete a message...
2017-10-28 10:04:24 +02:00
Stéphane Blondon
79601bef7a add infos about how to list folders, delete a message and mark a message as read 2017-10-27 18:45:04 +02:00
Martin Rusev
88143c7a1b Merge pull request #105 from sblondon/master
Remove double-quotes around attachment filename
2017-10-13 16:32:01 +02:00
Stéphane Blondon
7c6cc2fb5f Remove double-quotes around attachment filename 2017-10-13 16:02:57 +02:00
Martin Rusev
5460bec4b5 Merge pull request #104 from sblondon/master
Messages filter can use date objects instead of stringified dates
2017-10-11 08:45:10 +02:00
Stéphane Blondon
ea4fd7d9ea Messages filter can use date objects instead of stringified dates 2017-10-10 21:52:05 +02:00
Martin Rusev
c64bbd73e9 Merge pull request #102 from sblondon/master
Add info about how to run tests
2017-10-10 08:11:20 +02:00
Stephane Blondon
acfb2adc47 Add info about how to run tests; remove python 3.2 from supported versions 2017-10-09 22:21:15 +02:00
Martin Rusev
050e2d8f7f Merge pull request #98 from GetHappie/parse-attachment-fix
Fix parsing attachment
2017-10-09 13:13:16 +02:00
Martin Rusev
6e0ee232fe Merge pull request #100 from sblondon/master
Fix attachment parsing when a semicolon character ends the Content-Di…
2017-10-06 15:26:23 +02:00
Stéphane Blondon
a490081e57 fix attachment parsing when a semicolon character ends the Content-Disposition line 2017-10-06 14:58:08 +02:00
Andrey Mozgunov
a8b5ce1c31 Update test case 2017-10-02 13:21:16 +03:00
Martin Rusev
7c5a639cc8 Merge pull request #96 from GetHappie/unicode-decode-fix
Fix UnicodeDecodeError parsing email
2017-09-28 17:11:13 +02:00
Andrey Mozgunov
a78641c79a Fix parsing attachment 2017-09-28 14:12:35 +03:00
Andrey
7dad0edb72 Merge branch 'master' into unicode-decode-fix 2017-09-28 13:25:36 +03:00
Martin Rusev
2ed728485b Merge pull request #97 from GetHappie/inline-body-fix
Fix inline body parsing
2017-09-27 14:42:01 +02:00
Andrey Mozgunov
4559149dc0 Fix content disposition reading for python < 3.5 2017-09-27 13:42:43 +03:00
Andrey
0a98b960ac Merge branch 'master' into inline-body-fix 2017-09-26 15:56:14 +03:00
Andrey Mozgunov
ed86228e86 Fix inline body parsing 2017-09-26 15:51:34 +03:00
Andrey
eadddd6c0b Merge branch 'master' into unicode-decode-fix 2017-09-26 14:43:21 +03:00
Andrey Mozgunov
878c7991bf Fix UnicodeDecodeError parsing email 2017-09-26 14:38:33 +03:00
Martin Rusev
8a537de2f9 Merge pull request #95 from GetHappie/base64-decode-fix
Fix decoding base64 byte encoded params
2017-09-25 15:50:08 +02:00
Andrey Mozgunov
da450551d4 Fix decoding base64 byte encoded params 2017-09-25 14:03:39 +03:00
Martin Rusev
61a6c87fe6 Merge pull request #94 from wagner-certat/include-changelog
Include and run tests, include changelog in sdist
2017-09-18 12:41:28 +02:00
Sebastian Wagner
05683b3765 Include and run tests, include changelog in sdist
quite useful for packaging
2017-09-18 10:38:09 +02:00
22 changed files with 681 additions and 142 deletions

3
.gitignore vendored

@@ -31,3 +31,6 @@ nosetests.xml
example.*
example.py
# PyCharm
.idea/


@@ -1,3 +1,16 @@
## 0.9.5 (5 December 2017)
IMPROVEMENTS:
* `date__on` support: ([#109](https://github.com/martinrusev/imbox/pull/109))
* Starttls support: ([#108](https://github.com/martinrusev/imbox/pull/108))
* Mark emails as flagged/starred: ([#107](https://github.com/martinrusev/imbox/pull/107))
* Messages filter can use date objects instead of stringified dates: ([#104](https://github.com/martinrusev/imbox/pull/104))
* Fix attachment parsing when a semicolon character ends the Content-Disposition line: ([#100](https://github.com/martinrusev/imbox/pull/100))
* Parsing - UnicodeDecodeError() fixes: ([#96](https://github.com/martinrusev/imbox/pull/96))
* Imbox() `with` support: ([#92](https://github.com/martinrusev/imbox/pull/92))
## 0.9 (18 September 2017)
IMPROVEMENTS:


@@ -1,3 +1,5 @@
include LICENSE
include MANIFEST.in
include README.rst
include CHANGELOG.md
graft tests


@@ -12,7 +12,7 @@ Python library for reading IMAP mailboxes and converting email content to machin
Requirements
------------
Python (3.2, 3.3, 3.4, 3.5, 3.6)
Python (3.3, 3.4, 3.5, 3.6)
Installation
@@ -34,32 +34,55 @@ Usage
username='username',
password='password',
ssl=True,
ssl_context=None) as imbox:
ssl_context=None,
starttls=False) as imbox:
# Gets all messages
all_messages = imbox.messages()
# Get all folders
status, folders_with_additional_info = imbox.folders()
# Gets all messages from the inbox
all_inbox_messages = imbox.messages()
# Unread messages
unread_messages = imbox.messages(unread=True)
unread_inbox_messages = imbox.messages(unread=True)
# Flagged messages
inbox_flagged_messages = imbox.messages(flagged=True)
# Un-flagged messages
inbox_unflagged_messages = imbox.messages(unflagged=True)
# Flagged messages
flagged_messages = imbox.messages(flagged=True)
# Un-flagged messages
unflagged_messages = imbox.messages(unflagged=True)
# Messages sent FROM
messages_from = imbox.messages(sent_from='martin@amon.cx')
inbox_messages_from = imbox.messages(sent_from='sender@example.org')
# Messages sent TO
messages_from = imbox.messages(sent_to='martin@amon.cx')
inbox_messages_to = imbox.messages(sent_to='receiver@example.org')
# Messages received before specific date
messages_from = imbox.messages(date__lt='31-July-2013')
inbox_messages_received_before = imbox.messages(date__lt=datetime.date(2018, 7, 31))
# Messages received after specific date
messages_from = imbox.messages(date__gt='30-July-2013')
inbox_messages_received_after = imbox.messages(date__gt=datetime.date(2018, 7, 30))
# Messages received on a specific date
inbox_messages_received_on_date = imbox.messages(date__on=datetime.date(2018, 7, 30))
# Messages from a specific folder
messages_folder = imbox.messages(folder='Social')
messages_from_folder = imbox.messages(folder='Social')
# Messages whose subjects contain a string
inbox_messages_subject_christmas = imbox.messages(subject='Christmas')
# Messages from a specific folder
messages_in_folder_social = imbox.messages(folder='Social')
for uid, message in all_messages:
for uid, message in all_inbox_messages:
# Every message is an object with the following keys
message.sent_from
@@ -113,5 +136,34 @@ Usage
'subject': u 'Hello John, How are you today'
}
# With the message id, several actions on the message are available:
# delete the message
imbox.delete(uid)
# mark the message as read
imbox.mark_seen(uid)
Changelog
---------
`Changelog <https://github.com/martinrusev/imbox/blob/master/CHANGELOG.md>`_
Running the tests
-----------------
You can run the imbox tests with ``tox``.
Requirements:
* the supported python versions
* ``tox``. Tox is packaged in Debian and derivatives distributions.
On Ubuntu, you can install several python versions with:
.. code:: sh
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.X


@@ -1,89 +1,8 @@
from imbox.imap import ImapTransport
from imbox.parser import parse_email
from imbox.query import build_search_query
from imbox.imbox import Imbox
import logging
logger = logging.getLogger(__name__)
__version_info__ = (0, 9, 5)
__version__ = '.'.join([str(x) for x in __version_info__])
class Imbox:
__all__ = ['Imbox']
def __init__(self, hostname, username=None, password=None, ssl=True,
port=None, ssl_context=None, policy=None):
self.server = ImapTransport(hostname, ssl=ssl, port=port,
ssl_context=ssl_context)
self.hostname = hostname
self.username = username
self.password = password
self.parser_policy = policy
self.connection = self.server.connect(username, password)
logger.info("Connected to IMAP Server with user {username} on {hostname}{ssl}".format(
hostname=hostname, username=username, ssl=(" over SSL" if ssl else "")))
def __enter__(self):
return self
def __exit__(self, type, value, traceback):
self.logout()
def logout(self):
self.connection.close()
self.connection.logout()
logger.info("Disconnected from IMAP Server {username}@{hostname}".format(
hostname=self.hostname, username=self.username))
def query_uids(self, **kwargs):
query = build_search_query(**kwargs)
message, data = self.connection.uid('search', None, query)
if data[0] is None:
return []
return data[0].split()
def fetch_by_uid(self, uid):
message, data = self.connection.uid('fetch', uid, '(BODY.PEEK[])')
logger.debug("Fetched message for UID {}".format(int(uid)))
raw_email = data[0][1]
email_object = parse_email(raw_email, policy=self.parser_policy)
return email_object
def fetch_list(self, **kwargs):
uid_list = self.query_uids(**kwargs)
logger.debug("Fetch all messages for UID in {}".format(uid_list))
for uid in uid_list:
yield (uid, self.fetch_by_uid(uid))
def mark_seen(self, uid):
logger.info("Mark UID {} with \\Seen FLAG".format(int(uid)))
self.connection.uid('STORE', uid, '+FLAGS', '(\\Seen)')
def delete(self, uid):
logger.info("Mark UID {} with \\Deleted FLAG and expunge.".format(int(uid)))
mov, data = self.connection.uid('STORE', uid, '+FLAGS', '(\\Deleted)')
self.connection.expunge()
def copy(self, uid, destination_folder):
logger.info("Copy UID {} to {} folder".format(int(uid), str(destination_folder)))
return self.connection.uid('COPY', uid, destination_folder)
def move(self, uid, destination_folder):
logger.info("Move UID {} to {} folder".format(int(uid), str(destination_folder)))
if self.copy(uid, destination_folder):
self.delete(uid)
def messages(self, *args, **kwargs):
folder = kwargs.get('folder', False)
msg = ""
if folder:
self.connection.select(folder)
msg = " from folder '{}'".format(folder)
logger.info("Fetch list of messages{}".format(msg))
return self.fetch_list(**kwargs)
def folders(self):
return self.connection.list()


@@ -8,24 +8,20 @@ logger = logging.getLogger(__name__)
class ImapTransport:
def __init__(self, hostname, port=None, ssl=True, ssl_context=None):
def __init__(self, hostname, port=None, ssl=True, ssl_context=None, starttls=False):
self.hostname = hostname
self.port = port
kwargs = {}
if ssl:
self.transport = IMAP4_SSL
if not self.port:
self.port = 993
self.port = port or 993
if ssl_context is None:
ssl_context = pythonssllib.create_default_context()
kwargs["ssl_context"] = ssl_context
self.server = IMAP4_SSL(self.hostname, self.port, ssl_context=ssl_context)
else:
self.transport = IMAP4
if not self.port:
self.port = 143
self.port = port or 143
self.server = IMAP4(self.hostname, self.port)
self.server = self.transport(self.hostname, self.port, **kwargs)
if starttls:
self.server.starttls()
logger.debug("Created IMAP4 transport for {host}:{port}"
.format(host=self.hostname, port=self.port))
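The constructor change above boils down to choosing a default port per transport (`port or 993` with SSL, `port or 143` without) and then optionally calling `starttls()`. The following is a minimal sketch of the port selection only. The helper name `resolve_port` and the two constants are hypothetical and not part of imbox; the port numbers are the ones shown in the diff.

```python
from typing import Optional

IMAPS_PORT = 993  # default in the diff when ssl=True
IMAP_PORT = 143   # default in the diff when ssl=False


def resolve_port(port: Optional[int], ssl: bool) -> int:
    """Return the explicit port if given, else the default for the transport.

    Hypothetical helper for illustration; it mirrors the
    `port or 993` / `port or 143` expressions in the diff.
    """
    return port or (IMAPS_PORT if ssl else IMAP_PORT)


print(resolve_port(None, ssl=True))
print(resolve_port(None, ssl=False))
print(resolve_port(1143, ssl=False))
```

STARTTLS upgrades a plaintext connection, so it would typically be paired with `ssl=False`; the diff itself does not show whether that combination is enforced.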

13
imbox/imap.pyi Normal file

@@ -0,0 +1,13 @@
from imaplib import IMAP4, IMAP4_SSL
from ssl import SSLContext
from typing import Optional, Union, Tuple, List
class ImapTransport:
def __init__(self, hostname: str, port: Optional[int], ssl: bool,
ssl_context: Optional[SSLContext], starttls: bool) -> None: ...
def list_folders(self) -> Tuple[str, List[bytes]]: ...
def connect(self, username: str, password: str) -> Union[IMAP4, IMAP4_SSL]: ...

72
imbox/imbox.py Normal file

@@ -0,0 +1,72 @@
from imbox.imap import ImapTransport
from imbox.messages import Messages
import logging
logger = logging.getLogger(__name__)
class Imbox:
def __init__(self, hostname, username=None, password=None, ssl=True,
port=None, ssl_context=None, policy=None, starttls=False):
self.server = ImapTransport(hostname, ssl=ssl, port=port,
ssl_context=ssl_context, starttls=starttls)
self.hostname = hostname
self.username = username
self.password = password
self.parser_policy = policy
self.connection = self.server.connect(username, password)
logger.info("Connected to IMAP Server with user {username} on {hostname}{ssl}".format(
hostname=hostname, username=username, ssl=(" over SSL" if ssl or starttls else "")))
def __enter__(self):
return self
def __exit__(self, type, value, traceback):
self.logout()
def logout(self):
self.connection.close()
self.connection.logout()
logger.info("Disconnected from IMAP Server {username}@{hostname}".format(
hostname=self.hostname, username=self.username))
def mark_seen(self, uid):
logger.info("Mark UID {} with \\Seen FLAG".format(int(uid)))
self.connection.uid('STORE', uid, '+FLAGS', '(\\Seen)')
def mark_flag(self, uid):
logger.info("Mark UID {} with \\Flagged FLAG".format(int(uid)))
self.connection.uid('STORE', uid, '+FLAGS', '(\\Flagged)')
def delete(self, uid):
logger.info("Mark UID {} with \\Deleted FLAG and expunge.".format(int(uid)))
self.connection.expunge()
def copy(self, uid, destination_folder):
logger.info("Copy UID {} to {} folder".format(int(uid), str(destination_folder)))
return self.connection.uid('COPY', uid, destination_folder)
def move(self, uid, destination_folder):
logger.info("Move UID {} to {} folder".format(int(uid), str(destination_folder)))
if self.copy(uid, destination_folder):
self.delete(uid)
def messages(self, **kwargs):
folder = kwargs.get('folder', False)
if folder:
self.connection.select(folder)
msg = " from folder '{}'".format(folder)
else:
msg = " from inbox"
logger.info("Fetch list of messages{}".format(msg))
return Messages(connection=self.connection,
parser_policy=self.parser_policy,
**kwargs)
def folders(self):
return self.connection.list()

31
imbox/imbox.pyi Normal file

@@ -0,0 +1,31 @@
import datetime
from email._policybase import Policy
from inspect import Traceback
from ssl import SSLContext
from typing import Optional, Union, Tuple, List
class Imbox:
def __init__(self, hostname: str, username: Optional[str], password: Optional[str], ssl: bool,
port: Optional[int], ssl_context: Optional[SSLContext], policy: Optional[Policy], starttls: bool): ...
def __enter__(self) -> 'Imbox': ...
def __exit__(self, type: Exception, value: str, traceback: Traceback) -> None: ...
def logout(self) -> None: ...
def mark_seen(self, uid: bytes) -> None: ...
def mark_flag(self, uid: bytes) -> None: ...
def delete(self, uid: bytes) -> None: ...
def copy(self, uid: bytes, destination_folder: Union[bytes, str]) -> Tuple[str, Union[list, List[None, bytes]]]: ...
def move(self, uid: bytes, destination_folder: Union[bytes, str]) -> None: ...
def messages(self, **kwargs: Union[bool, str, datetime.date]) -> 'Messages': ...
def folders(self) -> Tuple[str, List[bytes]]: ...

62
imbox/messages.py Normal file

@@ -0,0 +1,62 @@
from imbox.parser import fetch_email_by_uid
from imbox.query import build_search_query
import logging
logger = logging.getLogger(__name__)
class Messages:
def __init__(self,
connection,
parser_policy,
**kwargs):
self.connection = connection
self.parser_policy = parser_policy
self.kwargs = kwargs
self._uid_list = self._query_uids(**kwargs)
logger.debug("Fetch all messages for UID in {}".format(self._uid_list))
def _fetch_email(self, uid):
return fetch_email_by_uid(uid=uid,
connection=self.connection,
parser_policy=self.parser_policy)
def _query_uids(self, **kwargs):
query_ = build_search_query(**kwargs)
message, data = self.connection.uid('search', None, query_)
if data[0] is None:
return []
return data[0].split()
def _fetch_email_list(self):
for uid in self._uid_list:
yield uid, self._fetch_email(uid)
def __repr__(self):
if len(self.kwargs) > 0:
return 'Messages({})'.format('\n'.join('{}={}'.format(key, value)
for key, value in self.kwargs.items()))
return 'Messages(ALL)'
def __iter__(self):
return self._fetch_email_list()
def __next__(self):
return self
def __len__(self):
return len(self._uid_list)
def __getitem__(self, index):
uids = self._uid_list[index]
if not isinstance(uids, list):
uid = uids
return uid, self._fetch_email(uid)
return [(uid, self._fetch_email(uid))
for uid in uids]
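The `Messages` class above keeps a list of UIDs and fetches each email only when it is requested, while supporting `len()`, repeated iteration, and indexing. The sketch below shows the same pattern without any IMAP connection. `LazyResults` and `fake_fetch` are hypothetical names used only for illustration, and the fetch function stands in for the network call.

```python
from typing import Callable, Iterator, List, Tuple


class LazyResults:
    """Hypothetical stand-in for the `Messages` idea: hold IDs, fetch on demand."""

    def __init__(self, ids: List[bytes], fetch: Callable[[bytes], str]) -> None:
        self._ids = ids
        self._fetch = fetch  # called once per item, only when the item is requested

    def __len__(self) -> int:
        return len(self._ids)

    def __iter__(self) -> Iterator[Tuple[bytes, str]]:
        # A fresh generator per call, so the object can be iterated repeatedly.
        return ((uid, self._fetch(uid)) for uid in self._ids)

    def __getitem__(self, index):
        selected = self._ids[index]
        if isinstance(selected, list):  # a slice yields a list of (uid, item) pairs
            return [(uid, self._fetch(uid)) for uid in selected]
        return selected, self._fetch(selected)


def fake_fetch(uid: bytes) -> str:
    # Stand-in for a network fetch.
    return "message " + uid.decode()


results = LazyResults([b"1", b"2", b"3"], fake_fetch)
print(len(results))
print(results[0])
print(results[1:])
print(list(results) == list(results))
```

Returning a new generator from `__iter__` is what lets a second `for` loop over the same object see all items again, which a bare generator would not allow.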

28
imbox/messages.pyi Normal file

@@ -0,0 +1,28 @@
import datetime
from email._policybase import Policy
from imaplib import IMAP4, IMAP4_SSL
from typing import Union, List, Generator, Tuple
class Messages:
def __init__(self,
connection: Union[IMAP4, IMAP4_SSL],
parser_policy: Policy,
**kwargs: Union[bool, str, datetime.date]) -> None: ...
def _fetch_email(self, uid: bytes) -> 'Struct': ...
def _query_uids(self, **kwargs: Union[bool, str, datetime.date]) -> List[bytes]: ...
def _fetch_email_list(self) -> Generator[Tuple[bytes, 'Struct']]: ...
def __repr__(self) -> str: ...
def __iter__(self) -> Generator[Tuple[bytes, 'Struct']]: ...
def __next__(self) -> 'Messages': ...
def __len__(self) -> int: ...
def __getitem__(self, index) -> Union['Struct', List['Struct']]: ...


@@ -1,14 +1,17 @@
import imaplib
import io
import re
import email
import base64
import quopri
import sys
import time
from datetime import datetime
from email.header import decode_header
from imbox.utils import str_encode, str_decode
import logging
logger = logging.getLogger(__name__)
@@ -33,7 +36,10 @@ def decode_mail_header(value, default_charset='us-ascii'):
return str_decode(str_encode(value, default_charset, 'replace'), default_charset)
else:
for index, (text, charset) in enumerate(headers):
logger.debug("Mail header no. {}: {} encoding {}".format(index, str_decode(text, charset or 'utf-8'), charset))
logger.debug("Mail header no. {index}: {data} encoding {charset}".format(
index=index,
data=str_decode(text, charset or 'utf-8', 'replace'),
charset=charset))
try:
headers[index] = str_decode(text, charset or default_charset,
'replace')
@@ -54,7 +60,7 @@ def get_mail_addresses(message, header_name):
for index, (address_name, address_email) in enumerate(addresses):
addresses[index] = {'name': decode_mail_header(address_name),
'email': address_email}
logger.debug("{} Mail addressees in message: <{}> {}".format(header_name.upper(), address_name, address_email))
logger.debug("{} Mail address in message: <{}> {}".format(header_name.upper(), address_name, address_email))
return addresses
@@ -63,13 +69,13 @@ def decode_param(param):
values = v.split('\n')
value_results = []
for value in values:
match = re.search(r'=\?((?:\w|-)+)\?(Q|B)\?(.+)\?=', value)
match = re.search(r'=\?((?:\w|-)+)\?([QB])\?(.+)\?=', value)
if match:
encoding, type_, code = match.groups()
if type_ == 'Q':
value = quopri.decodestring(code)
elif type_ == 'B':
value = base64.decodestring(code)
value = base64.decodebytes(code.encode())
value = str_encode(value, encoding)
value_results.append(value)
if value_results:
@@ -82,7 +88,11 @@ def parse_attachment(message_part):
# Check again if this is a valid attachment
content_disposition = message_part.get("Content-Disposition", None)
if content_disposition is not None and not message_part.is_multipart():
dispositions = content_disposition.strip().split(";")
dispositions = [
disposition.strip()
for disposition in content_disposition.split(";")
if disposition.strip()
]
if dispositions[0].lower() in ["attachment", "inline"]:
file_data = message_part.get_payload(decode=True)
@@ -97,13 +107,14 @@ def parse_attachment(message_part):
attachment['filename'] = filename
for param in dispositions[1:]:
name, value = decode_param(param)
if param:
name, value = decode_param(param)
if 'file' in name:
attachment['filename'] = value
if 'file' in name:
attachment['filename'] = value[1:-1] if value.startswith('"') else value
if 'create-date' in name:
attachment['create-date'] = value
if 'create-date' in name:
attachment['create-date'] = value
return attachment
@@ -119,9 +130,31 @@ def decode_content(message):
return content
def fetch_email_by_uid(uid, connection, parser_policy):
message, data = connection.uid('fetch', uid, '(BODY.PEEK[] FLAGS)')
logger.debug("Fetched message for UID {}".format(int(uid)))
raw_headers, raw_email = data[0]
email_object = parse_email(raw_email, policy=parser_policy)
flags = parse_flags(raw_headers.decode())
email_object.__dict__['flags'] = flags
return email_object
def parse_flags(headers):
"""Copied from https://github.com/girishramnani/gmail/blob/master/gmail/message.py"""
if len(headers) == 0:
return []
if sys.version_info[0] == 3:
headers = bytes(headers, "ascii")
return list(imaplib.ParseFlags(headers))
def parse_email(raw_email, policy=None):
if isinstance(raw_email, bytes):
raw_email = str_encode(raw_email, 'utf-8')
raw_email = str_encode(raw_email, 'utf-8', errors='ignore')
if policy is not None:
email_parse_kwargs = dict(policy=policy)
else:
@@ -132,9 +165,7 @@ def parse_email(raw_email, policy=None):
except UnicodeEncodeError:
email_message = email.message_from_string(raw_email.encode('utf-8'), **email_parse_kwargs)
maintype = email_message.get_content_maintype()
parsed_email = {}
parsed_email['raw_email'] = raw_email
parsed_email = {'raw_email': raw_email}
body = {
"plain": [],
@@ -154,7 +185,7 @@ def parse_email(raw_email, policy=None):
content = decode_content(part)
is_inline = content_disposition is None \
or content_disposition == "inline"
or content_disposition.startswith("inline")
if content_type == "text/plain" and is_inline:
body['plain'].append(content)
elif content_type == "text/html" and is_inline:

30
imbox/parser.pyi Normal file
View File

@@ -0,0 +1,30 @@
import datetime
from email._policybase import Policy
from email.message import Message
from imaplib import IMAP4_SSL
import io
from typing import Union, Dict, List, KeysView, Tuple, Optional
class Struct:
def __init__(self, **entries: Union[
str, datetime.datetime, Dict[str, str], list, List[Dict[str, str]]
]) -> None: ...
def keys(self) -> KeysView: ...
def __repr__(self) -> str: ...
def decode_mail_header(value: str, default_charset: str) -> str: ...
def get_mail_addresses(message: Message, header_name: str) -> List[Dict[str, str]]: ...
def decode_param(param: str) -> Tuple[str, str]: ...
def parse_attachment(message_part: Message) -> Optional[Dict[str, Union[int, str, io.BytesIO]]]: ...
def decode_content(message: Message) -> str: ...
def fetch_email_by_uid(uid: bytes, connection: IMAP4_SSL, parser_policy: Optional[Policy]) -> Struct: ...
def parse_email(raw_email: bytes, policy: Optional[Policy]) -> Struct: ...
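Two of the attachment fixes in this comparison concern `parse_attachment`, declared in the stub above: a `Content-Disposition` value that ends with a semicolon no longer produces an empty parameter, and double quotes around the filename are removed. The sketch below illustrates both ideas under simplifying assumptions. `split_disposition` and `disposition_params` are hypothetical helpers, and the sketch skips the encoded-word decoding that `decode_param` performs.

```python
from typing import Dict, List


def split_disposition(header: str) -> List[str]:
    """Split on ';', trim whitespace, and drop empty fragments."""
    return [part.strip() for part in header.split(";") if part.strip()]


def disposition_params(header: str) -> Dict[str, str]:
    """Collect `key=value` parameters, removing surrounding double quotes."""
    params = {}
    for part in split_disposition(header)[1:]:
        name, _, value = part.partition("=")
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        params[name.strip().lower()] = value
    return params


# Header shaped like the test fixture in this diff: trailing ';' and a quoted filename.
header = 'inline;\n filename="sigimg0";'
print(split_disposition(header))
print(disposition_params(header))
```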


@@ -4,27 +4,24 @@ import logging
logger = logging.getLogger(__name__)
IMAP_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
def format_date(date):
return "%s-%s-%s" % (date.day, IMAP_MONTHS[date.month - 1], date.year)
if isinstance(date, datetime.date):
return date.strftime('%d-%b-%Y')
return date
def build_search_query(**kwargs):
# Parse keyword arguments
unread = kwargs.get('unread', False)
unflagged = kwargs.get('unflagged', False)
flagged = kwargs.get('flagged', False)
sent_from = kwargs.get('sent_from', False)
sent_to = kwargs.get('sent_to', False)
date__gt = kwargs.get('date__gt', False)
if type(date__gt) is datetime.date:
date__gt = format_date(date__gt)
date__lt = kwargs.get('date__lt', False)
if type(date__lt) is datetime.date:
date__lt = format_date(date__lt)
date__on = kwargs.get('date__on', False)
subject = kwargs.get('subject')
query = []
@@ -32,6 +29,12 @@ def build_search_query(**kwargs):
if unread:
query.append("(UNSEEN)")
if unflagged:
query.append("(UNFLAGGED)")
if flagged:
query.append("(FLAGGED)")
if sent_from:
query.append('(FROM "%s")' % sent_from)
@@ -39,10 +42,13 @@ def build_search_query(**kwargs):
query.append('(TO "%s")' % sent_to)
if date__gt:
query.append('(SINCE "%s")' % date__gt)
query.append('(SINCE "%s")' % format_date(date__gt))
if date__lt:
query.append('(BEFORE "%s")' % date__lt)
query.append('(BEFORE "%s")' % format_date(date__lt))
if date__on:
query.append('(ON "%s")' % format_date(date__on))
if subject is not None:
query.append('(SUBJECT "%s")' % subject)
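The query changes above accept either a `datetime.date` or an already formatted string for the date filters and add an `ON` criterion. The following sketch shows how such keyword arguments could be turned into an IMAP search string. The names `format_imap_date` and `build_query` are hypothetical, only three of the keywords are handled, and joining the parts with spaces with a fallback of `(ALL)` is an assumption, since the tail of `build_search_query` is not visible in the diff. The sketch formats the month from a fixed table rather than `strftime('%d-%b-%Y')`, because `%b` follows the process locale while IMAP expects English month abbreviations.

```python
import datetime
from typing import Union

IMAP_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def format_imap_date(date: Union[str, datetime.date]) -> str:
    """Format a date as DD-Mon-YYYY; pass strings through unchanged."""
    if isinstance(date, datetime.date):
        return "{:02d}-{}-{}".format(date.day, IMAP_MONTHS[date.month - 1], date.year)
    return date


def build_query(**kwargs) -> str:
    """Hypothetical, reduced version of a search-query builder."""
    parts = []
    if kwargs.get("unread"):
        parts.append("(UNSEEN)")
    if kwargs.get("date__on"):
        parts.append('(ON "%s")' % format_imap_date(kwargs["date__on"]))
    if kwargs.get("subject") is not None:
        parts.append('(SUBJECT "%s")' % kwargs["subject"])
    # Assumption: criteria are space-joined, and no criteria means "all messages".
    return " ".join(parts) if parts else "(ALL)"


print(build_query(unread=True, date__on=datetime.date(2018, 7, 30)))
print(build_query(subject="Christmas"))
print(build_query())
```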

7
imbox/query.pyi Normal file

@@ -0,0 +1,7 @@
import datetime
from typing import Union
def format_date(date: Union[str, datetime.date]) -> str: ...
def build_search_query(**kwargs: Union[bool, str, datetime.date]) -> str: ...


@@ -1,14 +1,16 @@
import logging
logger = logging.getLogger(__name__)
def str_encode(value='', encoding=None, errors='strict'):
logger.debug("Encode str {} with and errors {}".format(value, encoding, errors))
return str(value, encoding, errors)
def str_decode(value='', encoding=None, errors='strict'):
if isinstance(value, str):
return bytes(value, encoding, errors).decode('utf-8')
elif isinstance(value, bytes):
return value.decode(encoding or 'utf-8', errors=errors)
else:
raise TypeError( "Cannot decode '{}' object".format(value.__class__) )
raise TypeError("Cannot decode '{}' object".format(value.__class__))
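Both helpers above forward an `errors` argument to Python's codec machinery, and the parser changes in this comparison start passing `'replace'` and `'ignore'` for badly encoded mail. The short sketch below shows what the three common error modes do with bytes that are not valid UTF-8. The sample bytes are made up for illustration.

```python
raw = b"caf\xe9"  # Latin-1 encoded text, not valid UTF-8

# 'replace' substitutes U+FFFD for the undecodable byte.
replaced = raw.decode("utf-8", errors="replace")

# 'ignore' silently drops the undecodable byte.
ignored = raw.decode("utf-8", errors="ignore")

# 'strict' (the default) raises instead of guessing.
try:
    raw.decode("utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(repr(replaced))
print(repr(ignored))
print(strict_failed)
```

`'replace'` keeps a visible marker where data was lost, while `'ignore'` hides the loss, which is worth keeping in mind when choosing between them for headers versus whole message bodies.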

6
imbox/utils.pyi Normal file

@@ -0,0 +1,6 @@
from typing import Optional, Union
def str_encode(value: Union[str, bytes], encoding: Optional[str], errors: str) -> str: ...
def str_decode(value: Union[str, bytes], encoding: Optional[str], errors: str) -> Union[str, bytes]: ...


@@ -1,7 +1,9 @@
from setuptools import setup
import os
version = '0.9'
import imbox
version = imbox.__version__
def read(filename):
@@ -22,10 +24,10 @@ setup(
zip_safe=False,
classifiers=(
'Programming Language :: Python',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6'
),
test_suite='tests',
)

22
tests/8422.msg Normal file

@@ -0,0 +1,22 @@
Delivered-To: receiver@example.com
Return-Path: <sender@example.com>
Date: Thu, 20 Jul 2017 07:34:22 -0500
Message-ID: <59705CFE.A95F.0016.0@journeys.com>
Subject: Following up Re: Looking to connect, let's schedule a call!
From: sender@example.com
To: "Receiver" <receiver@example.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=__PartBD85995F.0__="
This is a MIME message. If you are reading this text, you may want to
consider changing to a mail reader or gateway that understands how to
properly handle MIME multipart messages.
--=__PartBD85995F.0__=
Content-Type: multipart/alternative; boundary="=__PartBD85995F.1__="
--=__PartBD85995F.1__=
Content-Type: text/plain; charset=Windows-1252
Content-Transfer-Encoding: 8bit
Following up on my previous message. Id love to connect you with


@@ -1,6 +1,7 @@
import unittest
from imbox.parser import *
import os
import sys
if sys.version_info.minor < 3:
SMTP = False
@@ -8,6 +9,9 @@ else:
from email.policy import SMTP
TEST_DIR = os.path.dirname(os.path.abspath(__file__))
raw_email = """Delivered-To: johndoe@gmail.com
X-Originating-Email: [martin@amon.cx]
Message-ID: <test0@example.com>
@@ -80,6 +84,197 @@ Content-Transfer-Encoding: quoted-printable
"""
raw_email_encoded_multipart = b"""Delivered-To: receiver@example.com
Return-Path: <kkoudelka@wallvet.com>
Date: Tue, 08 Aug 2017 08:15:11 -0700
From: <kkoudelka@wallvet.com>
To: interviews+347243@gethappie.me
Message-Id: <20170808081511.2b876c018dd94666bcc18e28cf079afb.99766f164b.wbe@email24.godaddy.com>
Subject: RE: Kari, are you open to this?
Mime-Version: 1.0
Content-Type: multipart/related;
boundary="=_7c18e0b95b772890a22ed6c0f810a434"
--=_7c18e0b95b772890a22ed6c0f810a434
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset="utf-8"
<html><body><span style=3D"font-family:Verdana; color:#000; font-size:10pt;=
"><div>Hi Richie,</div></span></body></html>
--=_7c18e0b95b772890a22ed6c0f810a434
Content-Transfer-Encoding: base64
Content-Type: image/jpeg; charset=binary;
name="sigimg0";
Content-Disposition: inline;
filename="sigimg0";
/9j/4AAQSkZJRgABAQAAAQABAAD//gA+Q1JFQVRPUjogZ2QtanBlZyB2MS4wICh1c2luZyBJSkcg
jt0JaKhjm3xq23GR60UuZBZn/9k=
--=_7c18e0b95b772890a22ed6c0f810a434
Content-Transfer-Encoding: base64
Content-Type: image/jpeg; charset=binary;
name="sigimg1";
Content-Disposition: inline;
filename="sigimg1";
/9j/4AAQSkZJRgABAQAAAQABAAD//gA+Q1JFQVRPUjogZ2QtanBlZyB2MS4wICh1c2luZyBJSkcg
SlBFRyB2NjIpLCBkZWZhdWx0IHF1YWxpdHkK/9sAQwAIBgYHBgUIBwcHCQkICgwUDQwLCwwZEhMP
ooooA//Z
--=_7c18e0b95b772890a22ed6c0f810a434--
"""
raw_email_encoded_bad_multipart = b"""Delivered-To: receiver@example.com
Return-Path: <sender@example.com>
From: sender@example.com
To: "Receiver" <receiver@example.com>, "Second\r\n Receiver" <recipient@example.com>
Subject: Re: Looking to connect with you...
Date: Thu, 20 Apr 2017 15:32:52 +0000
Message-ID: <BN6PR16MB179579288933D60C4016D078C31B0@BN6PR16MB1795.namprd16.prod.outlook.com>
Content-Type: multipart/related;
boundary="_004_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_";
type="multipart/alternative"
MIME-Version: 1.0
--_004_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_
Content-Type: multipart/alternative;
boundary="_000_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_"
--_000_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
SGkgRGFuaWVsbGUsDQoNCg0KSSBhY3R1YWxseSBhbSBoYXBweSBpbiBteSBjdXJyZW50IHJvbGUs
Y3J1aXRlciB8IENoYXJsb3R0ZSwgTkMNClNlbnQgdmlhIEhhcHBpZQ0KDQoNCg==
--_000_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: base64
PGh0bWw+DQo8aGVhZD4NCjxtZXRhIGh0dHAtZXF1aXY9IkNvbnRlbnQtVHlwZSIgY29udGVudD0i
CjwvZGl2Pg0KPC9kaXY+DQo8L2JvZHk+DQo8L2h0bWw+DQo=
--_000_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_--
--_004_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_
Content-Type: image/png; name="=?utf-8?B?T3V0bG9va0Vtb2ppLfCfmIoucG5n?="
Content-Description: =?utf-8?B?T3V0bG9va0Vtb2ppLfCfmIoucG5n?=
Content-Disposition: inline;
filename="=?utf-8?B?T3V0bG9va0Vtb2ppLfCfmIoucG5n?="; size=488;
creation-date="Thu, 20 Apr 2017 15:32:52 GMT";
modification-date="Thu, 20 Apr 2017 15:32:52 GMT"
Content-ID: <254962e2-f05c-40d1-aa11-0d34671b056c>
Content-Transfer-Encoding: base64
iVBORw0KGgoAAAANSUhEUgAAABMAAAATCAYAAAByUDbMAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJ
cvED9AIR3TCAAAMAqh+p+YMVeBQAAAAASUVORK5CYII=
--_004_BN6PR16MB179579288933D60C4016D078C31B0BN6PR16MB1795namp_--
"""
raw_email_encoded_another_bad_multipart = b"""Delivered-To: receiver@example.com
Return-Path: <sender@example.com>
Mime-Version: 1.0
Date: Wed, 22 Mar 2017 15:21:55 -0500
Message-ID: <58D29693.192A.0075.1@wimort.com>
Subject: Re: Reaching Out About Peoples Home Equity
From: sender@example.com
To: receiver@example.com
Content-Type: multipart/alternative; boundary="____NOIBTUQXSYRVOOAFLCHY____"
--____NOIBTUQXSYRVOOAFLCHY____
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
modification-date="Wed, 22 Mar 2017 15:21:55 -0500"
Chloe,
--____NOIBTUQXSYRVOOAFLCHY____
Content-Type: multipart/related; boundary="____XTSWHCFJMONXSVGPVDLY____"
--____XTSWHCFJMONXSVGPVDLY____
Content-Type: text/html; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
modification-date="Wed, 22 Mar 2017 15:21:55 -0500"
<HTML xmlns=3D"http://www.w3.org/1999/xhtml">
<BODY style=3D"COLOR: black; FONT: 10pt Segoe UI; MARGIN: 4px 4px 1px" =
leftMargin=3D0 topMargin=3D0 offset=3D"0" marginwidth=3D"0" marginheight=3D=
"0">
<DIV>Chloe,</DIV>
<IMG src=3D"cid:VFXVGHA=
GXNMI.36b3148cbf284ba18d35bdd8386ac266" width=3D1 height=3D1> </BODY></HTML=
>
--____XTSWHCFJMONXSVGPVDLY____
Content-ID: <TLUACRGXVUBY.IMAGE_3.gif>
Content-Type: image/gif
Content-Transfer-Encoding: base64
R0lGODlhHgHCAPf/AIOPr9GvT7SFcZZjVTEuMLS1tZKUlJN0Znp4eEA7PV1aWvz8+8V6Zl1BNYxX
HvOZ1/zmOd95agUEADs=
--____XTSWHCFJMONXSVGPVDLY____
Content-ID: <VFXVGHAGXNMI.36b3148cbf284ba18d35bdd8386ac266>
Content-Type: image/xxx
Content-Transfer-Encoding: base64
R0lGODlhAQABAPAAAAAAAAAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==
--____XTSWHCFJMONXSVGPVDLY____--
--____NOIBTUQXSYRVOOAFLCHY____--
"""
raw_email_with_trailing_semicolon_to_disposition_content = b"""Delivered-To: receiver@example.com
Return-Path: <sender@example.com>
Mime-Version: 1.0
Date: Wed, 22 Mar 2017 15:21:55 -0500
Message-ID: <58D29693.192A.0075.1@wimort.com>
Subject: Re: Reaching Out About Peoples Home Equity
From: sender@example.com
To: receiver@example.com
Content-Type: multipart/alternative; boundary="____NOIBTUQXSYRVOOAFLCHY____"
--____NOIBTUQXSYRVOOAFLCHY____
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
modification-date="Wed, 22 Mar 2017 15:21:55 -0500"
Hello Chloe
--____NOIBTUQXSYRVOOAFLCHY____
Content-Type: multipart/related; boundary="____XTSWHCFJMONXSVGPVDLY____"
--____XTSWHCFJMONXSVGPVDLY____
Content-Type: text/html; charset=iso-8859-15
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline;
modification-date="Wed, 22 Mar 2017 15:21:55 -0500"
<HTML xmlns=3D"http://www.w3.org/1999/xhtml">
<BODY>
<DIV>Hello Chloe</DIV>
</BODY>
</HTML>
--____XTSWHCFJMONXSVGPVDLY____
Content-Type: application/octet-stream; name="abc.xyz"
Content-Description: abc.xyz
Content-Disposition: attachment; filename="abc.xyz";
Content-Transfer-Encoding: base64
R0lGODlhHgHCAPf/AIOPr9GvT7SFcZZjVTEuMLS1tZKUlJN0Znp4eEA7PV1aWvz8+8V6Zl1BNYxX
HvOZ1/zmOd95agUEADs=
--____XTSWHCFJMONXSVGPVDLY____
Content-ID: <VFXVGHAGXNMI.36b3148cbf284ba18d35bdd8386ac266>
Content-Type: image/xxx
Content-Transfer-Encoding: base64
R0lGODlhAQABAPAAAAAAAAAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==
--____XTSWHCFJMONXSVGPVDLY____--
--____NOIBTUQXSYRVOOAFLCHY____--
"""
class TestParser(unittest.TestCase):
def test_parse_email(self):
@@ -96,16 +291,40 @@ class TestParser(unittest.TestCase):
self.assertEqual('Выписка по карте', parsed_email.subject)
self.assertEqual('Выписка по карте 1234', parsed_email.body['html'][0])
def test_parse_email_invalid_unicode(self):
parsed_email = parse_email(open(os.path.join(TEST_DIR, '8422.msg'), 'rb').read())
self.assertEqual("Following up Re: Looking to connect, let's schedule a call!", parsed_email.subject)
def test_parse_email_inline_body(self):
parsed_email = parse_email(raw_email_encoded_another_bad_multipart)
self.assertEqual("Re: Reaching Out About Peoples Home Equity", parsed_email.subject)
self.assertTrue(parsed_email.body['plain'])
self.assertTrue(parsed_email.body['html'])
def test_parse_email_multipart(self):
parsed_email = parse_email(raw_email_encoded_multipart)
self.assertEqual("RE: Kari, are you open to this?", parsed_email.subject)
def test_parse_email_bad_multipart(self):
parsed_email = parse_email(raw_email_encoded_bad_multipart)
self.assertEqual("Re: Looking to connect with you...", parsed_email.subject)
def test_parse_email_ignores_header_casing(self):
self.assertEqual('one', parse_email('Message-ID: one').message_id)
self.assertEqual('one', parse_email('Message-Id: one').message_id)
self.assertEqual('one', parse_email('Message-id: one').message_id)
self.assertEqual('one', parse_email('message-id: one').message_id)
# TODO - Complete the test suite
def test_parse_attachment(self):
pass
parsed_email = parse_email(raw_email_with_trailing_semicolon_to_disposition_content)
self.assertEqual(1, len(parsed_email.attachments))
attachment = parsed_email.attachments[0]
self.assertEqual('application/octet-stream', attachment['content-type'])
self.assertEqual(71, attachment['size'])
self.assertEqual('abc.xyz', attachment['filename'])
self.assertTrue(attachment['content'])
# TODO - Complete the test suite
def test_decode_mail_header(self):
pass
@@ -117,6 +336,9 @@ class TestParser(unittest.TestCase):
from_message_object = email.message_from_string("From: John Smith <johnsmith@gmail.com>")
self.assertEqual([{'email': 'johnsmith@gmail.com', 'name': 'John Smith'}], get_mail_addresses(from_message_object, 'from'))
invalid_encoding_in_from_message_object = email.message_from_string("From: =?UTF-8?Q?C=E4cilia?= <caciliahxg827m@example.org>")
self.assertEqual([{'email': 'caciliahxg827m@example.org', 'name': 'C�cilia'}], get_mail_addresses(invalid_encoding_in_from_message_object, 'from'))
def test_parse_email_with_policy(self):
if not SMTP:
return
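
The name expected by the invalid-encoding assertion in `get_mail_addresses` above follows from how Python decodes the header. `=?UTF-8?Q?C=E4cilia?=` declares UTF-8 but carries the Latin-1 byte 0xE4, which is not valid UTF-8 on its own. Here is a minimal sketch using only the standard library. It is not imbox's own code, and the helper name `decode_lenient` is hypothetical. It shows that decoding with `errors='replace'` substitutes U+FFFD for the bad byte instead of raising:

```python
from email.header import decode_header


def decode_lenient(raw_header):
    """Decode an RFC 2047 header, replacing undecodable bytes with U+FFFD.

    Hypothetical helper for illustration; not part of imbox.
    """
    parts = []
    for value, charset in decode_header(raw_header):
        if isinstance(value, bytes):
            # Fall back to ASCII when the header declares no charset.
            value = value.decode(charset or 'ascii', errors='replace')
        parts.append(value)
    return ''.join(parts)


# The declared charset (UTF-8) does not match the byte 0xE4 in the payload.
name = decode_lenient('=?UTF-8?Q?C=E4cilia?=')
```

With a strict decode the same input would raise `UnicodeDecodeError`, which is presumably the failure this test guards against.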


@@ -15,22 +15,36 @@ class TestQuery(unittest.TestCase):
res = build_search_query(unread=True)
self.assertEqual(res, "(UNSEEN)")
def test_unflagged(self):
res = build_search_query(unflagged=True)
self.assertEqual(res, "(UNFLAGGED)")
def test_flagged(self):
res = build_search_query(flagged=True)
self.assertEqual(res, "(FLAGGED)")
def test_sent_from(self):
res = build_search_query(sent_from='test@example.com')
self.assertEqual(res, "(FROM \"test@example.com\")")
self.assertEqual(res, '(FROM "test@example.com")')
def test_sent_to(self):
res = build_search_query(sent_to='test@example.com')
self.assertEqual(res, "(TO \"test@example.com\")")
self.assertEqual(res, '(TO "test@example.com")')
def test_date__gt(self):
res = build_search_query(date__gt=date(2014, 12, 31))
self.assertEqual(res, "(SINCE \"31-Dec-2014\")")
self.assertEqual(res, '(SINCE "31-Dec-2014")')
def test_date__lt(self):
res = build_search_query(date__lt=date(2014, 1, 1))
self.assertEqual(res, "(BEFORE \"1-Jan-2014\")")
self.assertEqual(res, '(BEFORE "01-Jan-2014")')
def test_date__on(self):
res = build_search_query(date__on=date(2014, 1, 1))
self.assertEqual(res, '(ON "01-Jan-2014")')
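
The updated date expectations above use a zero-padded day (`01-Jan-2014` rather than `1-Jan-2014`). Here is a minimal sketch of one way to produce that format. The helpers `format_imap_date` and `on_clause` are hypothetical, and this is not necessarily how `build_search_query` is implemented. Month names come from a fixed table rather than `strftime('%b')`, so the result does not depend on the process locale:

```python
from datetime import date

# English month abbreviations, as used in IMAP date strings.
MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')


def format_imap_date(d):
    """Render a date as DD-Mon-YYYY with a zero-padded day (hypothetical helper)."""
    return '{:02d}-{}-{:04d}'.format(d.day, MONTHS[d.month - 1], d.year)


def on_clause(d):
    """Build an ON search clause in the shape the tests expect (hypothetical helper)."""
    return '(ON "{}")'.format(format_imap_date(d))


clause = on_clause(date(2014, 1, 1))
```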

6
tox.ini Normal file

@@ -0,0 +1,6 @@
[tox]
envlist = py33,py34,py35,py36
[testenv]
deps=nose
commands=nosetests -v