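Python Digital Mobile Device Forensics

This chapter explains digital forensics on mobile devices with Python and the concepts involved.

Introduction

Mobile device forensics is a branch of digital forensics that deals with the acquisition and analysis of mobile devices to recover digital evidence of investigative interest. It differs from computer forensics because mobile devices have built-in communication systems that can provide useful location-related information.

Although smartphones appear in digital forensic investigations more and more often, they are still considered non-standard because of their heterogeneity. Computer hardware such as hard disks, on the other hand, is considered standard, and its examination has developed into a stable discipline. In the digital forensics industry, there is still much debate about the techniques used for non-standard devices that hold transient evidence, such as smartphones.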

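Artifacts Extractable from Mobile Devices

Compared with older phones that held only call logs or SMS messages, modern mobile devices hold a large amount of digital information. As a result, they can give investigators considerable insight into their users. Some of the artifacts that can be extracted from mobile devices are listed below -

Messages - These are useful artifacts that can reveal the owner's state of mind and can even give the investigator previously unknown information.
Location History - Location history data is a useful artifact that investigators can use to verify where a person was at a particular time.
Applications Installed - By examining the kinds of applications installed, investigators can gain insight into the habits and thinking of the mobile user.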

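Evidence Sources and Processing in Python

Smartphones use SQLite databases and PLIST files as their main sources of evidence. In this section, we process these evidence sources with Python.

Analyzing PLIST Files

A PLIST (property list) is a flexible and convenient format for storing application data, especially on iPhone devices. It uses the extension .plist. Such files store information about bundles and applications. A PLIST can be in one of two formats: XML or binary. The following Python code opens and reads a PLIST file. Note that before proceeding we must create our own Info.plist file.

First, install the third-party library biplist with the following command -

pip install biplist

Now, import some useful libraries for working with plist files -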

import biplist
import os
import sys

Now, use the following code in the main method to read the plist file into a variable -

def main(plist):
   try:
      data = biplist.readPlist(plist)
   except (biplist.InvalidPlistException, biplist.NotBinaryPlistException) as e:
      print("[-] Invalid PLIST file - unable to be opened by biplist")
      sys.exit(1)

Now, we can read individual values from this variable or print it directly to the console.
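Python's standard-library plistlib module can also read both XML and binary property lists. The following is a minimal sketch that uses plistlib in place of biplist. The helper name read_plist and the sample keys are hypothetical and chosen only for illustration -

```python
import os
import plistlib
import tempfile


def read_plist(path):
    """Load an XML or binary property list and return its top-level object."""
    with open(path, "rb") as fh:
        # plistlib.load() detects the XML or binary format on its own
        return plistlib.load(fh)


# Build a throwaway Info.plist so the sketch is self-contained
sample = {"CFBundleIdentifier": "com.example.app", "CFBundleVersion": "1.0"}
with tempfile.TemporaryDirectory() as tmp:
    plist_path = os.path.join(tmp, "Info.plist")
    with open(plist_path, "wb") as fh:
        plistlib.dump(sample, fh, fmt=plistlib.FMT_BINARY)

    data = read_plist(plist_path)
    for key, value in data.items():
        print("{}: {}".format(key, value))
```

In a real examination, the path passed to read_plist() would point at a file taken from the evidence copy rather than at a temporary file.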

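SQLite Databases

SQLite serves as the primary data repository on mobile devices. SQLite is an in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. Because it is zero-configuration, you do not need to set it up on your system, unlike other databases.

If you are new to SQLite databases, you can follow the link www.tutorialspoint.com/sqlite/index.htm. To learn in detail about using SQLite with Python, you can follow the link www.tutorialspoint.com/sqlite/sqlite_python.htm.

During mobile forensics, we can interact with the sms.db file of a mobile device and extract valuable information from the message table. Python has a built-in library named sqlite3 for connecting to SQLite databases. You can import it with the following command -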

import sqlite3

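Now, with the following commands, we can connect to a database, for example the sms.db of a mobile device -

conn = sqlite3.connect('sms.db')
c = conn.cursor()

Here, c is the cursor object through which we interact with the database.

Now, suppose we want to execute a particular command, say to get the details from a table named abc. This can be done with the following command -

c.execute("SELECT * FROM abc")
# call c.close() only once the cursor is no longer needed

The result of the above command is held by the cursor object. We can use the fetchall() method to dump the result into a variable that we can manipulate.

We can use the following commands to get the column names of the message table in sms.db -

c.execute("PRAGMA table_info(message)")
table_data = c.fetchall()
columns = [x[1] for x in table_data]

Note that here we use the SQLite PRAGMA command, a special command for controlling various environment variables and state flags within the SQLite environment. In the above code, the fetchall() method returns a list of tuples, one per column. The name of each column is stored at index 1 of its tuple.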
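To see why index 1 holds the column name, it helps to look at a full row returned by PRAGMA table_info. The sketch below uses an in-memory database as a stand-in for sms.db. The three columns of the message table here are hypothetical and do not reflect the real schema -

```python
import sqlite3

# In-memory stand-in for sms.db; the column set is hypothetical
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE message (msg_id INTEGER PRIMARY KEY, text TEXT, date INTEGER)")

c.execute("PRAGMA table_info(message)")
table_data = c.fetchall()

# Each row is (cid, name, type, notnull, dflt_value, pk)
for cid, name, col_type, notnull, default, pk in table_data:
    print(cid, name, col_type)

columns = [row[1] for row in table_data]
conn.close()
```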

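Now, with the following commands, we can query all the data in the table and store it in a variable named data_msg -

c.execute("SELECT * FROM message")
data_msg = c.fetchall()

The above commands store the data in the variable; we can then write that data to a CSV file with the csv.writer() method.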
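The query and the CSV export can be combined as in the following sketch. It runs against an in-memory table with hypothetical columns and sample rows, and it writes to an io.StringIO buffer where a real script would open an output file -

```python
import csv
import io
import sqlite3

# In-memory stand-in for sms.db with hypothetical columns and rows
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE message (msg_id INTEGER PRIMARY KEY, text TEXT, date INTEGER)")
c.executemany("INSERT INTO message (text, date) VALUES (?, ?)",
              [("hello", 100), ("see you at 5", 200)])

c.execute("PRAGMA table_info(message)")
columns = [row[1] for row in c.fetchall()]
c.execute("SELECT * FROM message")
data_msg = c.fetchall()
conn.close()

# io.StringIO stands in for a file opened with open(path, "w", newline="")
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(columns)    # header row taken from table_info
writer.writerows(data_msg)  # one CSV line per message row
csv_text = out.getvalue()
print(csv_text)
```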

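iTunes Backups

iPhone mobile forensics can be performed on the backups made by iTunes. Forensic examiners rely on analyzing the logical backups of an iPhone acquired through iTunes. iTunes uses the AFC (Apple File Conduit) protocol to take the backup. Apart from the escrow key records, the backup process does not modify anything on the iPhone.

Now, the question arises: why is it important for a digital forensics expert to understand iTunes backups? It matters when we have direct access to the suspect's computer rather than the iPhone, because when a computer is used to sync with an iPhone, most of the information on the iPhone is likely to be backed up on the computer.

Backup Process and Its Location

Whenever an Apple product is backed up to a computer, it is synced with iTunes, and a specific folder named with the device's unique ID is created. In the latest backup format, the files are stored in subfolders named with the first two hexadecimal characters of the file name. Among these backup files, some, such as Info.plist, are useful along with the database named Manifest.db. The following table shows the backup locations, which vary with the operating system running iTunes -

OS	Backup Location
Win7	C:\Users\[username]\AppData\Roaming\Apple Computer\MobileSync\Backup\
Mac OS X	~/Library/Application Support/MobileSync/Backup/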
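A script can pick one of these locations based on the platform it runs on. The following is a minimal sketch that covers only the two locations in the table. The function name default_backup_dir and its parameters are hypothetical -

```python
import os
import sys


def default_backup_dir(platform=sys.platform, home=None):
    """Return the conventional iTunes backup folder for a platform.

    Only the two locations from the table are covered; any other
    platform returns None so the caller can supply a path instead.
    """
    home = home or os.path.expanduser("~")
    if platform.startswith("win"):
        return "\\".join([home, "AppData", "Roaming", "Apple Computer",
                          "MobileSync", "Backup"])
    if platform == "darwin":
        return "/".join([home, "Library", "Application Support",
                         "MobileSync", "Backup"])
    return None


print(default_backup_dir())
```

The examiner should still confirm that the returned folder exists, since backups can be stored elsewhere.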

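To process iTunes backups with Python, we first need to identify all the backups in the backup location for our operating system. Then we iterate through each backup and read its Manifest.db database.

The following Python code does this, step by step -

First, import the necessary libraries as follows -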

from __future__ import print_function
import argparse
import logging
import os

from shutil import copyfile
import sqlite3
import sys
logger = logging.getLogger(__name__)

Now, provide two positional arguments, namely INPUT_DIR and OUTPUT_DIR, which represent the iTunes backup folder and the desired output folder -

if __name__ == "__main__":
   # the description text is a placeholder
   parser = argparse.ArgumentParser(description = "iOS backup processor")
   parser.add_argument("INPUT_DIR", help = "Location of folder containing iOS backups, "
      "e.g. ~/Library/Application Support/MobileSync/Backup folder")
   parser.add_argument("OUTPUT_DIR", help = "Output Directory")
   parser.add_argument("-l", help = "Log file path", default = __file__[:-2] + "log")
   parser.add_argument("-v", help = "Increase verbosity", action = "store_true")
   args = parser.parse_args()

Now, set up logging as follows -

if args.v:
   logger.setLevel(logging.DEBUG)
else:
   logger.setLevel(logging.INFO)

Now, set the message format for the log as follows -

msg_fmt = logging.Formatter("%(asctime)-15s %(funcName)-13s""%(levelname)-8s %(message)s")
strhndl = logging.StreamHandler(sys.stderr)
strhndl.setFormatter(fmt = msg_fmt)

fhndl = logging.FileHandler(args.l, mode = 'a')
fhndl.setFormatter(fmt = msg_fmt)

logger.addHandler(strhndl)
logger.addHandler(fhndl)
logger.info("Starting iBackup Visualizer")
logger.debug("Supplied arguments: {}".format(" ".join(sys.argv[1:])))
logger.debug("System: " + sys.platform)
logger.debug("Python Version: " + sys.version)

The following lines of code use the os.makedirs() function to create the necessary folders for the desired output directory -

if not os.path.exists(args.OUTPUT_DIR):
   os.makedirs(args.OUTPUT_DIR)

Now, pass the supplied input and output directories to the main() function as follows -

if os.path.exists(args.INPUT_DIR) and os.path.isdir(args.INPUT_DIR):
   main(args.INPUT_DIR, args.OUTPUT_DIR)
else:
   logger.error("Supplied input directory does not exist or is not ""a directory")
   sys.exit(1)

Now, write the main() function, which in turn calls the backup_summary() function to identify all the backups present in the input folder -

def main(in_dir, out_dir):
   backups = backup_summary(in_dir)

def backup_summary(in_dir):
   logger.info("Identifying all iOS backups in {}".format(in_dir))
   root = os.listdir(in_dir)
   backups = {}
   
   for x in root:
      temp_dir = os.path.join(in_dir, x)
      # backup folders are named with a 40-character device ID
      if os.path.isdir(temp_dir) and len(x) == 40:
         num_files = 0
         size = 0
         
         # use a separate name so the os.walk() loop does not rebind root
         for dirpath, subdirs, files in os.walk(temp_dir):
            num_files += len(files)
            size += sum(os.path.getsize(os.path.join(dirpath, name))
               for name in files)
         backups[x] = [temp_dir, num_files, size]
   return backups
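The len(x) == 40 test accepts any folder whose name is 40 characters long. A stricter check is sketched below, under the assumption that backup folder names are 40 hexadecimal characters, as a SHA-1 digest would be. The helper name looks_like_backup_name is hypothetical -

```python
import re

# 40 hexadecimal characters, the length of a SHA-1 digest written in hex
SHA1_NAME = re.compile(r"[0-9a-fA-F]{40}")


def looks_like_backup_name(name):
    """Return True when the folder name is exactly 40 hex characters."""
    return SHA1_NAME.fullmatch(name) is not None


print(looks_like_backup_name("a" * 40))  # hex characters only
print(looks_like_backup_name("z" * 40))  # right length, but not hex
```

Such a check could replace the length comparison in backup_summary() if folders with unrelated 40-character names turn up in the input directory.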

Now, print a summary of each backup to the console as follows -

print("Backup Summary")
print("=" * 20)

if len(backups) > 0:
   for i, b in enumerate(backups):
      print("Backup No.: {} \n""Backup Dev. Name: {} \n""# Files: {} \n""Backup Size (Bytes): {}\n".format(i, b, backups[b][1], backups[b][2]))

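Now, dump the contents of the Manifest.db file into a variable named db_items.

# still inside the for loop over backups
try:
   db_items = process_manifest(backups[b][0])
except IOError:
   logger.warning("Non-iOS 10 backup encountered or "
      "invalid backup. Continuing to next backup.")
   continue

Now, let us define a function that takes the directory path of a backup -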

def process_manifest(backup):
   manifest = os.path.join(backup, "Manifest.db")
   
   if not os.path.exists(manifest):
      logger.error("Manifest DB not found in {}".format(manifest))
      raise IOError

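Now, using sqlite3, we connect to the database and work with it through a cursor named c -

   # continuation of process_manifest()
   conn = sqlite3.connect(manifest)
   c = conn.cursor()
   items = {}
   
   for row in c.execute("SELECT * from Files;"):
      items[row[0]] = [row[2], row[1], row[3]]
   return items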
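The row indexes in this loop are easier to follow against a concrete table. The sketch below builds an in-memory Files table and runs the same loop over it. The column layout (fileID, domain, relativePath, flags, file) is an assumption about Manifest.db, and the inserted row is made-up sample data -

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column layout of the Files table in Manifest.db
conn.execute(
    "CREATE TABLE Files (fileID TEXT PRIMARY KEY, domain TEXT, "
    "relativePath TEXT, flags INTEGER, file BLOB)"
)
# Made-up sample row; the file ID is a placeholder, not a real hash
sample_id = "ab" + "0" * 38
conn.execute(
    "INSERT INTO Files VALUES (?, ?, ?, ?, ?)",
    (sample_id, "HomeDomain", "Library/SMS/sms.db", 1, None),
)

items = {}
for row in conn.execute("SELECT * FROM Files;"):
    # fileID -> [relativePath, domain, flags]
    items[row[0]] = [row[2], row[1], row[3]]
conn.close()
print(items)
```

With this layout, the first element of each value is the relative path that the later copy step uses to rebuild the folder structure.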

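Back in main(), the loop over the backups passes each backup to create_files(); the else branch handles the case where no backup was found -

      create_files(in_dir, out_dir, b, db_items)
   print("=" * 20)
else:
   logger.warning("No valid backups found. The input directory should be "
      "the parent-directory immediately above the SHA-1 hash "
      "iOS device backups")
   sys.exit(2)

Now, define the create_files() method as follows -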

def create_files(in_dir, out_dir, b, db_items):
   msg = "Copying Files for backup {} to {}".format(b, os.path.join(out_dir, b))
   logger.info(msg)

Now, iterate through each key in the db_items dictionary -

   files_not_found = 0  # counter used by the copy step that follows
   for x, key in enumerate(db_items):
      if db_items[key][0] is None or db_items[key][0] == "":
         continue
      else:
         dirpath = os.path.join(out_dir, b, os.path.dirname(db_items[key][0]))
         filepath = os.path.join(out_dir, b, db_items[key][0])
         
         if not os.path.exists(dirpath):
            os.makedirs(dirpath)
         original_dir = b + "/" + key[0:2] + "/" + key
         path = os.path.join(in_dir, original_dir)
         
         if os.path.exists(filepath):
            filepath = filepath + "_{}".format(x)
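The two path rules in this loop can be isolated into small helpers, which makes them easy to check on their own. This is a sketch only. The names backup_source_path and unique_destination are hypothetical, and the exists parameter is there so the collision rule can be exercised without touching the disk -

```python
import os


def backup_source_path(in_dir, backup_id, file_id):
    """Locate a hashed file inside a backup: <backup>/<first 2 chars>/<file_id>."""
    return os.path.join(in_dir, backup_id, file_id[0:2], file_id)


def unique_destination(filepath, index, exists=os.path.exists):
    """Append _<index> when the destination path is already taken."""
    if exists(filepath):
        return "{}_{}".format(filepath, index)
    return filepath


print(backup_source_path("Backup", "device-id", "abcdef0123"))
print(unique_destination("out/sms.db", 7, exists=lambda p: True))
```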

Now, use the shutil.copyfile() method to copy the backed-up file as follows -

         try:
            copyfile(path, filepath)
         except IOError:
            logger.debug("File not found in backup: {}".format(path))
            files_not_found += 1
   if files_not_found > 0:
      logger.warning("{} files listed in the Manifest.db not "
         "found in backup".format(files_not_found))
   copyfile(os.path.join(in_dir, b, "Info.plist"), os.path.join(out_dir, b, "Info.plist"))
   copyfile(os.path.join(in_dir, b, "Manifest.db"), os.path.join(out_dir, b, "Manifest.db"))
   copyfile(os.path.join(in_dir, b, "Manifest.plist"), os.path.join(out_dir, b, "Manifest.plist"))
   copyfile(os.path.join(in_dir, b, "Status.plist"), os.path.join(out_dir, b, "Status.plist"))

With the above Python script, we can reconstruct the backup's original file structure in the output folder. To decrypt encrypted backups, we can use the pycrypto Python library.

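Wi-Fi

Mobile devices can connect to the outside world through Wi-Fi networks, which are available almost everywhere. Sometimes a device connects to these open networks automatically.

On an iPhone, the list of open Wi-Fi connections the device has joined is stored in a PLIST file named com.apple.wifi.plist. This file contains the Wi-Fi SSID, the BSSID, and the connection time.

We need to extract the Wi-Fi details from a standard Cellebrite XML report using Python. We then use the API of the Wireless Geographic Logging Engine (WIGLE), a popular platform that can be used to find the location of a device from the names of Wi-Fi networks.

We can use the Python library named requests to access the WIGLE API. It can be installed as follows -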

pip install requests

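Getting the API from WIGLE

We need to register on WIGLE's website https://wigle.net/account to get a free API key from WIGLE. The Python script for obtaining information about a user's device and its connections through WIGLE's API is discussed below -

First, import the following libraries -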

from __future__ import print_function

import argparse
import csv
import os
import sys
import xml.etree.ElementTree as ET
import requests

Now, provide two positional arguments, namely INPUT_FILE and OUTPUT_CSV, which represent the input file containing Wi-Fi MAC addresses and the desired output CSV file, respectively -

if __name__ == "__main__":
   # the description text is a placeholder
   parser = argparse.ArgumentParser(description = "Wi-Fi MAC address lookup")
   parser.add_argument("INPUT_FILE", help = "INPUT FILE with MAC Addresses")
   parser.add_argument("OUTPUT_CSV", help = "Output CSV File")
   parser.add_argument("-t", help = "Input type: Cellebrite XML report or TXT file",
      choices = ('xml', 'txt'), default = "xml")
   parser.add_argument('--api', help = "Path to API key file",
      default = os.path.expanduser("~/.wigle_api"),
      type = argparse.FileType('r'))
   args = parser.parse_args()

The following lines of code check that the input file exists and is a file. If not, the script exits. They then create the output directory if needed and read the API key -

if not os.path.exists(args.INPUT_FILE) or \
      not os.path.isfile(args.INPUT_FILE):
   print("[-] {} does not exist or is not a file".format(args.INPUT_FILE))
   sys.exit(1)

directory = os.path.dirname(args.OUTPUT_CSV)
if directory != '' and not os.path.exists(directory):
   os.makedirs(directory)

# the key file holds one line with two parts separated by a colon
api_key = args.api.readline().strip().split(":")
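The script later uses api_key[0] and api_key[1] as the two halves of the credentials, so a malformed key file would only fail at request time. A small validating parser is sketched below. It assumes the file holds a single line in the form name:token, and the function name parse_api_key and the sample value are hypothetical -

```python
def parse_api_key(line):
    """Split a 'name:token' line into a (name, token) tuple."""
    parts = line.strip().split(":", 1)
    # Reject lines without a colon or with an empty half
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected a single line in the form name:token")
    return parts[0], parts[1]


print(parse_api_key("example_name:example_token\n"))
```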

Now, pass the arguments to main() as follows -

main(args.INPUT_FILE, args.OUTPUT_CSV, args.t, api_key)

def main(in_file, out_csv, type, api_key):
   if type == 'xml':
      wifi = parse_xml(in_file)
   else:
      wifi = parse_txt(in_file)
   query_wigle(wifi, out_csv, api_key)

Now, we parse the XML file as follows -

def parse_xml(xml_file):
   wifi = {}
   xmlns = "{http://pa.cellebrite.com/report/2.0}"
   print("[+] Opening {} report".format(xml_file))
   
   xml_tree = ET.parse(xml_file)
   print("[+] Parsing report for all connected WiFi addresses")
   
   root = xml_tree.getroot()

Now, iterate through the child elements of the root as follows -

   for child in root.iter():
      if child.tag == xmlns + "model":
         if child.get("type") == "Location":
            for field in child.findall(xmlns + "field"):
               if field.get("name") == "TimeStamp":
                  ts_value = field.find(xmlns + "value")
                  try:
                     ts = ts_value.text
                  except AttributeError:
                     continue

Now, we check whether the string "SSID" is present in the text of the value -

# value is assumed to be the value element of the field that holds the network details
if "SSID" in value.text:
   bssid, ssid = value.text.split("\t")
   bssid = bssid[7:]
   ssid = ssid[6:]
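The slice offsets 7 and 6 match the lengths of the prefixes "BSSID: " and "SSID: ". The sketch below shows the split on a made-up value, assuming the text has the form of a BSSID part and an SSID part separated by a tab. The function name split_bssid_ssid and the sample string are hypothetical -

```python
def split_bssid_ssid(text):
    """Parse 'BSSID: <mac><TAB>SSID: <name>' into a (bssid, ssid) tuple."""
    bssid, ssid = text.split("\t")
    # len("BSSID: ") == 7 and len("SSID: ") == 6, hence the slice offsets
    return bssid[7:], ssid[6:]


sample = "BSSID: 00:11:22:33:44:55\tSSID: CoffeeShop"
print(split_bssid_ssid(sample))
```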

Now, we add the BSSID, SSID, and timestamp to the wifi dictionary as follows -

if bssid in wifi.keys():
   wifi[bssid]["Timestamps"].append(ts)
   wifi[bssid]["SSID"].append(ssid)
else:
   wifi[bssid] = {"Timestamps": [ts], "SSID": [ssid], "Wigle": {}}
# at the end of parse_xml()
return wifi

The text parser is much simpler than the XML parser, as shown below -

def parse_txt(txt_file):
   wifi = {}
   print("[+] Extracting MAC addresses from {}".format(txt_file))
   
   with open(txt_file) as mac_file:
      for line in mac_file:
         wifi[line.strip()] = {"Timestamps": ["N/A"], "SSID": ["N/A"], "Wigle": {}}
   return wifi

Now, we use the requests module to make the WIGLE API calls, which brings us to the query_wigle() method -

def query_wigle(wifi_dictionary, out_csv, api_key):
   print("[+] Querying Wigle.net through Python API for {} "
      "APs".format(len(wifi_dictionary)))
   for mac in wifi_dictionary:
      wigle_results = query_mac_addr(mac, api_key)
      # the loop body continues with the error handling shown further on

def query_mac_addr(mac_addr, api_key):
   query_url = "https://api.wigle.net/api/v2/network/search?" \
      "onlymine=false&freenet=false&paynet=false" \
      "&netid={}".format(mac_addr)
   req = requests.get(query_url, auth = (api_key[0], api_key[1]))
   return req.json()
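The query string can also be assembled from a dictionary, which avoids stray spaces and escapes the MAC address. The following sketch uses urllib.parse.urlencode from the standard library with the same parameters as the URL above. The function name build_query_url is hypothetical -

```python
from urllib.parse import urlencode

BASE_URL = "https://api.wigle.net/api/v2/network/search"


def build_query_url(mac_addr):
    """Build the search URL for one MAC address with escaped parameters."""
    params = {
        "onlymine": "false",
        "freenet": "false",
        "paynet": "false",
        "netid": mac_addr,
    }
    # urlencode() percent-encodes the colons in the MAC address
    return "{}?{}".format(BASE_URL, urlencode(params))


print(build_query_url("00:11:22:33:44:55"))
```

The requests library can do the same encoding when the dictionary is passed through the params argument of requests.get().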

The WIGLE API limits the number of calls per day; when that limit is exceeded, the script must report the error, as follows -

      # continuation of the for loop in query_wigle()
      try:
         if wigle_results["resultCount"] == 0:
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
         else:
            wifi_dictionary[mac]["Wigle"] = wigle_results
      except KeyError:
         if wigle_results["error"] == "too many queries today":
            print("[-] Wigle daily query limit exceeded")
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
         else:
            print("[-] Other error encountered for "
               "address {}: {}".format(mac, wigle_results['error']))
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
   prep_output(out_csv, wifi_dictionary)
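The same decision logic can be written as a pure function that takes one decoded response and returns the results together with a status message, which makes it testable without network access. This is a sketch that relies only on the keys used above (resultCount, results, error). The function name extract_results is hypothetical -

```python
def extract_results(wigle_results):
    """Return (results, message) for one decoded WIGLE response."""
    if "error" in wigle_results:
        if wigle_results["error"] == "too many queries today":
            return [], "daily query limit exceeded"
        return [], "other error: {}".format(wigle_results["error"])
    if wigle_results.get("resultCount", 0) == 0:
        # A valid response that simply found nothing
        return [], None
    return wigle_results.get("results", []), None


print(extract_results({"error": "too many queries today"}))
print(extract_results({"resultCount": 1, "results": [{"trilat": 1.5, "trilong": 2.5}]}))
```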

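Now, we use the prep_output() method to flatten the dictionary into chunks that are easy to write -

def prep_output(output, data):
   csv_data = {}
   google_map = "https://www.google.com/maps/search/"

Now, access all the data we have collected so far as follows -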

for x, mac in enumerate(data):
   for y, ts in enumerate(data[mac]["Timestamps"]):
      for z, result in enumerate(data[mac]["Wigle"]["results"]):
         shortres = data[mac]["Wigle"]["results"][z]
         g_map_url = "{}{},{}".format(google_map, shortres["trilat"],shortres["trilong"])

Now, we can write the output to a CSV file with a write_csv() function, as we did in earlier scripts in this chapter.
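The body of write_csv() is not shown here, so the following is only a sketch of how the flattening and the CSV output could fit together. The function signatures, the column headers, and the sample data are all assumptions. Only the dictionary keys (Timestamps, SSID, Wigle, results, trilat, trilong) and the Google Maps base URL come from the script above -

```python
import csv
import io

GOOGLE_MAP = "https://www.google.com/maps/search/"
HEADERS = ["BSSID", "SSID", "Timestamp", "Latitude", "Longitude", "Google Map"]


def flatten(data):
    """Yield one row per (MAC, timestamp, WIGLE result) combination."""
    for mac, entry in data.items():
        for ts in entry["Timestamps"]:
            for result in entry["Wigle"].get("results", []):
                yield {
                    "BSSID": mac,
                    "SSID": ", ".join(entry["SSID"]),
                    "Timestamp": ts,
                    "Latitude": result["trilat"],
                    "Longitude": result["trilong"],
                    "Google Map": "{}{},{}".format(
                        GOOGLE_MAP, result["trilat"], result["trilong"]),
                }


def write_csv(fileobj, rows):
    """Write the flattened rows to an open text file object."""
    writer = csv.DictWriter(fileobj, fieldnames=HEADERS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data in the shape built by parse_txt() and query_wigle()
data = {"00:11:22:33:44:55": {
    "Timestamps": ["N/A"], "SSID": ["CoffeeShop"],
    "Wigle": {"results": [{"trilat": 1.5, "trilong": 2.5}]}}}
out = io.StringIO()  # stands in for open(path, "w", newline="")
write_csv(out, flatten(data))
print(out.getvalue())
```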