{ "cells": [ { "cell_type": "markdown", "id": "b739c0e2", "metadata": {}, "source": [ "# Image Detection Demo with PytorchWildlife" ] }, { "cell_type": "markdown", "id": "1197e180", "metadata": {}, "source": [ "This tutorial guides you on how to use PyTorchWildlife for image detection. We will go through the process of setting up the environment, defining the detection model, as well as performing inference and saving the results in different ways.\n", "\n", "## Prerequisites\n", "Install PytorchWildlife running the following commands:\n", "```bash\n", "conda create -n pytorch_wildlife python=3.8 -y\n", "conda activate pytorch_wildlife\n", "pip install PytorchWildlife\n", "```\n", "Also, make sure you have a CUDA-capable GPU if you intend to run the model on a GPU. This notebook can also run on CPU.\n", "\n", "## Importing libraries\n", "First, we'll start by importing the necessary libraries and modules." ] }, { "cell_type": "code", "execution_count": null, "id": "c44e7713", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from PIL import Image\n", "import torch\n", "from torch.utils.data import DataLoader\n", "from PytorchWildlife.models import detection as pw_detection\n", "from PytorchWildlife.data import transforms as pw_trans\n", "from PytorchWildlife.data import datasets as pw_data \n", "from PytorchWildlife import utils as pw_utils" ] }, { "cell_type": "markdown", "id": "3a23982f", "metadata": {}, "source": [ "## Setting GPU\n", "If you are using a GPU for this exercise, please specify which GPU to use for the computations. By default, GPU number 0 is used. Adjust this as per your setup. You don't need to run this cell if you are using a CPU." ] }, { "cell_type": "code", "execution_count": 2, "id": "8f622040", "metadata": {}, "outputs": [], "source": [ "torch.cuda.set_device(0) # Only use if you are running on GPU." 
] }, { "cell_type": "markdown", "id": "6abd07b5", "metadata": {}, "source": [ "## Model Initialization\n", "We will initialize the MegaDetectorV5 model for image detection. This model is designed for detecting animals in images." ] }, { "cell_type": "code", "execution_count": 4, "id": "eb25db43", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Fusing layers... \n", "Fusing layers... \n", "Model summary: 733 layers, 140054656 parameters, 0 gradients, 208.8 GFLOPs\n", "Model summary: 733 layers, 140054656 parameters, 0 gradients, 208.8 GFLOPs\n" ] } ], "source": [ "DEVICE = \"cuda\" # Use \"cuda\" if GPU is available \"cpu\" if no GPU is available\n", "detection_model = pw_detection.MegaDetectorV5(device=DEVICE, pretrained=True)" ] }, { "cell_type": "markdown", "id": "1e57dcca", "metadata": {}, "source": [ "## Single Image Detection\n", "We will first perform detection on a single image. Make sure to verify that you have the image in the specified path." ] }, { "cell_type": "code", "execution_count": 5, "id": "0d730b20", "metadata": {}, "outputs": [], "source": [ "tgt_img_path = \"./demo_data/imgs/10050028_0.JPG\"\n", "img = np.array(Image.open(tgt_img_path).convert(\"RGB\"))\n", "transform = pw_trans.MegaDetector_v5_Transform(target_size=detection_model.IMAGE_SIZE,\n", " stride=detection_model.STRIDE)\n", "results = detection_model.single_image_detection(transform(img), img.shape, tgt_img_path)\n", "pw_utils.save_detection_images(results, \"./demo_output\")" ] }, { "cell_type": "markdown", "id": "2e23329c", "metadata": {}, "source": [ "## Batch Image Detection\n", "Next, we'll demonstrate how to process multiple images in batches. This is useful when you have a large number of images and want to process them efficiently." 
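] }, { "cell_type": "markdown", "id": "9c0ffee2", "metadata": {}, "source": [ "Batch detection is driven by a standard PyTorch `DataLoader`. As a quick illustration of how batching behaves, the toy sketch below (plain PyTorch with a made-up stand-in dataset, not the PytorchWildlife one) splits 10 items into batches of 4 with `drop_last=False`, so the final partial batch is kept:\n", "```python\n", "import torch\n", "from torch.utils.data import DataLoader, Dataset\n", "\n", "class ToyImages(Dataset):\n", "    # Stand-in dataset yielding random 3x64x64 \"images\".\n", "    def __init__(self, n):\n", "        self.n = n\n", "    def __len__(self):\n", "        return self.n\n", "    def __getitem__(self, idx):\n", "        return torch.rand(3, 64, 64)\n", "\n", "loader = DataLoader(ToyImages(10), batch_size=4, shuffle=False, drop_last=False)\n", "print([batch.shape[0] for batch in loader])  # [4, 4, 2]\n", "```\n", "The real loader in the next cell works the same way, just over images read from a folder."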
] }, { "cell_type": "code", "execution_count": null, "id": "561eff0c", "metadata": {}, "outputs": [], "source": [ "tgt_folder_path = \"./demo_data/imgs\"\n", "dataset = pw_data.DetectionImageFolder(\n", "    tgt_folder_path,\n", "    transform=pw_trans.MegaDetector_v5_Transform(target_size=detection_model.IMAGE_SIZE,\n", "                                                 stride=detection_model.STRIDE)\n", ")\n", "loader = DataLoader(dataset, batch_size=32, shuffle=False,\n", "                    pin_memory=True, num_workers=8, drop_last=False)\n", "results = detection_model.batch_image_detection(loader)" ] }, { "cell_type": "markdown", "id": "bb41830b", "metadata": {}, "source": [ "## Output Results\n", "PytorchWildlife can output detection results in multiple formats. Here are some examples:\n", "\n", "### 1. Annotated Images:\n", "This will output the images with bounding boxes drawn around the detected animals. The images will be saved in the specified output directory." ] }, { "cell_type": "code", "execution_count": 7, "id": "f63310ab", "metadata": {}, "outputs": [], "source": [ "pw_utils.save_detection_images(results, \"batch_output\")" ] }, { "cell_type": "markdown", "id": "bbd016ae", "metadata": {}, "source": [ "### 2. Cropped Images:\n", "This will output the cropped images of the detected animals. The cropping is done around the detection bounding box. The images will be saved in the specified output directory." ] }, { "cell_type": "code", "execution_count": 8, "id": "13653739", "metadata": {}, "outputs": [], "source": [ "pw_utils.save_crop_images(results, \"crop_output\")" ] }, { "cell_type": "markdown", "id": "49be4063", "metadata": {}, "source": [ "### 3. JSON Format:\n", "This will output the detection results in JSON format. The JSON file will be saved in the specified output directory."
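] }, { "cell_type": "markdown", "id": "e5a7d210", "metadata": {}, "source": [ "A saved results file can later be reloaded and filtered with Python's standard `json` module, for example to keep only high-confidence detections. The record below is a made-up illustration; the field names are assumptions, so inspect the actual saved file to confirm its schema:\n", "```python\n", "import json\n", "\n", "# Made-up example record; the real schema may differ.\n", "record = {\"img_id\": \"demo.JPG\",\n", "          \"detections\": [{\"category\": \"animal\", \"conf\": 0.92},\n", "                         {\"category\": \"animal\", \"conf\": 0.31}]}\n", "# Round-trip through JSON and keep detections above a confidence threshold.\n", "parsed = json.loads(json.dumps(record))\n", "confident = [d for d in parsed[\"detections\"] if d[\"conf\"] >= 0.5]\n", "print(len(confident))  # 1\n", "```"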
] }, { "cell_type": "code", "execution_count": 9, "id": "b627280b", "metadata": {}, "outputs": [], "source": [ "pw_utils.save_detection_json(results, \"./batch_output.json\",\n", " categories=detection_model.CLASS_NAMES)" ] }, { "cell_type": "markdown", "id": "a4ee1d7b", "metadata": {}, "source": [ "### Copyright (c) Microsoft Corporation. All rights reserved.\n", "### Licensed under the MIT License." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.18" } }, "nbformat": 4, "nbformat_minor": 5 }