Keyword search : PST (mail with 7z attached file)

mandelbrot · April 28, 2020, 2:18pm

Hi all,

Consider this SCENARIO 1:

Embedded File Extraction Module
Keyword Search
PST-mail (pst file has only one mail with only one ZIP attached file)

ZIPPED

Everything works fine on Scenario N.1 :

Results - Keyword Hits = 1
Indexed Text is "Intelligible"

Consider this SCENARIO 2 :

Embedded File Extraction Module
Keyword Search
PST-mail (pst file has only one mail with only one 7Z attached file)

In Scenario N.2 :

Results - Keyword Hits = 0 (searching the same "Scenario 1 keyword")
Indexed Text is NOT "Intelligible"

So my question :

Are attached 7z files (compressed with the LZMA or BZip2 method) supported by the Embedded File Extraction Module + Keyword Search on PST files?

Thanks in advance for your suggestions

Luca

apriestman · April 28, 2020, 2:36pm

I made two 7z files with bzip2 and lzma and they seemed to work.

To figure out where the issue is, I’d suggest trying something simpler than going through the email parser, then the embedded file extractor, and then keyword search. You can right-click and extract your .7z files to disk, make a new case adding them as a logical data source, and then run just the embedded file extractor to see whether any files get extracted, instead of running keyword search.

mandelbrot · April 28, 2020, 4:09pm

I’ve made a new case adding the 7z files as a logical data source.
Then I ran the embedded file extractor and all files were extracted correctly.

However, processing an Outlook .pst file (keyword search + embedded file extractor) that has a mail with the same 7z file attached still fails …

Thanks in advance for your support.

apriestman · April 28, 2020, 4:56pm

Any chance you can share your pst file? Send me a PM if it’s possible.

apriestman · April 28, 2020, 5:59pm

Meanwhile, I’m confused about what I’m seeing in your second screenshot. If the email parser found the attachment I believe it should show up in its compressed form as a child of the .pst file, like these gifs:

But I don’t see any children under your SEVENZIP.pst, so it doesn’t seem like there was even anything for the embedded file extractor to run on. This would suggest it’s a problem with the email parser. What did it look like in your working case? (I can’t seem to make your first screenshot larger)

apriestman · April 29, 2020, 11:23am

Thank you for sharing your .pst files. They both work for me. I’m using Autopsy 4.14.0 on Windows 10. Here’s what I did:

Added both .pst files as a logical file set:

Ran embedded file extractor, email parser, and keyword search (and hash lookup to verify that the files were different):

In the tree, I can see the .7z files extracted by the email parser under each of the .pst files. If I click on them, I can then look at the pdf extracted by the embedded file extractor module. I can see “zanzara” in the indexed text in both, and doing a keyword search for it does work.

Can you try again doing that exact procedure? If it doesn’t work, see if there’s anything in the log (go to Help->Open log folder to find the logs)

mandelbrot · April 29, 2020, 12:13pm

Hi Ann,

I'm using ONLY the embedded file extractor + keyword search (both selected in the configure ingest modules window). No email parser.

Suppose for a moment that email-parser ingestion doesn’t exist.

The combo “embedded file extractor and keyword search” always works perfectly and always catches all my keywords, except with 7z files attached to mails…

Can you replicate this using only the embedded file extractor + keyword search ingest modules?

Luca

apriestman · April 29, 2020, 12:59pm

I wouldn’t expect it to work without email parser. The email parser pulls out the 7zip file - Autopsy wouldn’t know about it otherwise. Then the 7zip file is decompressed by the embedded file extractor. Without email parser you can still run keyword search on the original pst file but it’s just going to see the compressed data so you probably won’t see anything.
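A minimal sketch of why searching the raw container finds nothing, using Python's standard `lzma` module as a stand-in. This is not Autopsy code: the sample text, the `keyword_hit` helper, and the "pst-like" blob are assumptions made for illustration, and `lzma` writes .xz streams rather than real .7z archives.

```python
import lzma

KEYWORD = b"zanzara"

# Stand-in for the text of the PDF inside the attachment (hypothetical content).
document_text = b"Report body. The keyword zanzara appears in this document.\n" * 20

# Stand-in for the 7z attachment: LZMA-compressed bytes (xz container, not .7z).
attachment = lzma.compress(document_text)

# Stand-in for the .pst: some plain mail text followed by the raw attachment bytes.
pst_like_blob = b"Subject: see attached\n" + attachment


def keyword_hit(data: bytes) -> bool:
    """Naive byte-level keyword search (hypothetical helper)."""
    return KEYWORD in data


# Searching the container as-is only sees compressed bytes.
raw_hit = keyword_hit(pst_like_blob)

# "Pull out the attachment, then decompress it" exposes the text again.
extracted_text = lzma.decompress(attachment)
extracted_hit = keyword_hit(extracted_text)

print("hit in raw container:", raw_hit)
print("hit after extraction:", extracted_hit)
```

The compressed stream does not carry the keyword as literal bytes, so the search only succeeds once something has extracted and decompressed the attachment first.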

Is there some reason you don’t want to run email parser?

apriestman · April 29, 2020, 1:18pm

Note that the embedded file extractor only runs on archives and documents, not .pst files. So it’s not going to extract the archives from the .pst, but it will extract files from the archive attachment extracted by the email parser module.

http://sleuthkit.org/autopsy/docs/user-docs/4.15.0/embedded_file_extractor_page.html
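To make the "which module handles which file" point concrete, here is a rough sketch that sniffs a file's type from its leading bytes and maps it to the module that would deal with it. The signatures for ZIP, 7z, and PST are the well-known magic numbers; the `sniff_type` and `responsible_module` functions and the routing table are hypothetical, and this is not how Autopsy is actually implemented.

```python
# Well-known file signatures (magic numbers) at offset 0.
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"!BDN": "pst",
}

# Hypothetical routing table, following the description in this thread:
# the email parser handles .pst files, the embedded file extractor handles archives.
HANDLERS = {
    "pst": "email parser",
    "zip": "embedded file extractor",
    "7z": "embedded file extractor",
}


def sniff_type(header: bytes) -> str:
    """Return a coarse type name based on the first bytes of a file."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def responsible_module(header: bytes) -> str:
    """Name the module that would process this file in this toy model."""
    return HANDLERS.get(sniff_type(header), "none")


if __name__ == "__main__":
    samples = {
        "SEVENZIP.pst": b"!BDN\x00\x00\x00\x00",
        "attachment.7z": b"7z\xbc\xaf\x27\x1c\x00\x04",
        "attachment.zip": b"PK\x03\x04\x14\x00",
    }
    for name, header in samples.items():
        print(name, "->", responsible_module(header))
```

In this toy model a .pst never reaches the embedded file extractor directly; only the attachment that the email parser pulls out of it does.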

mandelbrot · April 29, 2020, 1:18pm

If the files are zipped with a .zip extension, it works … and the keyword is “intercepted”

Check the first part of video.mp4.

mandelbrot · April 29, 2020, 1:55pm

You can now also find in Dropbox:

https://www.dropbox.com/sh/dxor8wz8owa3dqv/AAC5FRXyZRzw9jHVblhdCc8sa?dl=0

a file named ZIPPED_ANN.pst: the mail is the same and the archived file is the same pdf file, but the pdf is “zip compressed”, NOT “7z compressed”.

Now Scenario A : create a case with only ZIPPED_ANN.pst and run embedded file extractor, email parser, and keyword search.

Now SCENARIO B : create a case with only SEVENZIP_LZMA.pst and run embedded file extractor, email parser.

Autopsy’s behaviour is different, the keyword hits are different, and so on …

Can you explain why?

apriestman · April 29, 2020, 2:17pm

If I run only keyword search on ZIPPED_ANN.pst, I do indeed see nice indexed text. So it has nothing to do with our embedded file extractor. My guess is that Solr/Tika (which is what we use for keyword search indexing) can do some basic parsing of .pst files and decompression, but probably doesn’t support LZMA.

mandelbrot · April 29, 2020, 4:23pm

Thanks Ann.

My feeling is that there is some overlap between the parsing and decompression of Solr / Tika (used by keyword search indexing) and Autopsy e-mail-parser ingest process.

This overlap can undeniably lead to some confusion and double result indexing …

I really appreciate your suggestion/support

Best regards.
Luca
