0001_fastai_Is it a bird? Creating a model from your own data

Useful Course sites

Official course site: for lesson 1

Official notebooks repo, on nbviewer

Official Is it a bird notebook on kaggle


How to use autoreload

This documentation has a helpful example.

Put these two lines at the top of this notebook:

%load_ext autoreload
%autoreload 2

so that, when I update the fastdebug library, I don’t need to rerun import fastdebug.utils .... because autoreload reloads the library for me automatically.
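To see what autoreload saves me from doing, here is a minimal sketch of the manual equivalent outside a notebook, using importlib.reload from the standard library. The module name autoreload_demo_mod and its version function are made up for this demo; this is not how the autoreload extension is implemented, only what a single reload does.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep stale .pyc caches out of this demo

# Write a throwaway module to a temp folder (the module name is made up).
tmp_dir = Path(tempfile.mkdtemp())
mod_file = tmp_dir / "autoreload_demo_mod.py"
mod_file.write_text("def version():\n    return 'old'\n")
sys.path.insert(0, str(tmp_dir))

import autoreload_demo_mod
before = autoreload_demo_mod.version()

# Edit the source on disk, the way I would when updating a library.
mod_file.write_text("def version():\n    return 'newer'\n")
importlib.invalidate_caches()
importlib.reload(autoreload_demo_mod)  # the manual step autoreload does for me
after = autoreload_demo_mod.version()

print(before, after)
```

With %autoreload 2 the reload step happens before each cell runs, so I never have to call it myself.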

How to install and update libraries

!mamba update -q -y fastai
!pip install -Uqq duckduckgo_search

Know a little about the libraries

from fastdebug.utils import *
from fastdebug.core import *

what is fastai

import fastai
fastai: 2.7.9 
fastai simplifies training fast and accurate neural nets using modern best practices    
Jeremy Howard, Sylvain Gugger, and contributors 
python_version: >=3.7     
whatinside(fastai, lib=True)
The library has 24 modules
import fastai.losses as fl
whatinside(fl, dun=True)
fastai.losses has: 
11 items in its __all__, and 
334 user defined functions, 
178 classes or class objects, 
4 builtin funcs and methods, and
535 callables.

BaseLoss:                           class, type    Same as `loss_cls`, but flattens input and target.
CrossEntropyLossFlat:               class, type    Same as `nn.CrossEntropyLoss`, but flattens input and target.
FocalLoss:                          class, PrePostInitMeta    Same as `nn.Module`, but no need for subclasses to call `super().__init__`
FocalLossFlat:                      class, type    Same as CrossEntropyLossFlat but with focal paramter, `gamma`. Focal loss is introduced by Lin et al. 
https://arxiv.org/pdf/1708.02002.pdf. Note the class weighting factor in the paper, alpha, can be 
implemented through pytorch `weight` argument passed through to F.cross_entropy.
BCEWithLogitsLossFlat:              class, type    Same as `nn.BCEWithLogitsLoss`, but flattens input and target.
BCELossFlat:                        function    Same as `nn.BCELoss`, but flattens input and target.
MSELossFlat:                        function    Same as `nn.MSELoss`, but flattens input and target.
L1LossFlat:                         function    Same as `nn.L1Loss`, but flattens input and target.
LabelSmoothingCrossEntropy:         class, PrePostInitMeta    Same as `nn.Module`, but no need for subclasses to call `super().__init__`
LabelSmoothingCrossEntropyFlat:     class, type    Same as `LabelSmoothingCrossEntropy`, but flattens input and target.
DiceLoss:                           class, type    Dice loss for segmentation
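The counts printed above can be approximated with the standard inspect module. This is only a sketch of the idea, not fastdebug's actual implementation: summarize_module is a hypothetical helper, and json is just a stand-in module so the example runs without fastai installed. The exact numbers whatinside reports may be computed differently.

```python
import inspect
import json  # any importable module works; json is only a stand-in


def summarize_module(mod):
    """Hypothetical helper: count the kinds of objects found in a module."""
    members = inspect.getmembers(mod)
    return {
        "all": len(getattr(mod, "__all__", [])),
        "functions": sum(inspect.isfunction(o) for _, o in members),
        "classes": sum(inspect.isclass(o) for _, o in members),
        "builtins": sum(inspect.isbuiltin(o) for _, o in members),
        "callables": sum(callable(o) for _, o in members),
    }


summary = summarize_module(json)
print(summary)
```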

what is duckduckgo

import duckduckgo_search
duckduckgo-search: 2.1.3 
Search for words, documents, images, news, maps and text translation using the DuckDuckGo.com search engine.    
python_version: >=3.7     
duckduckgo_search has: 
0 items in its __all__, and 
6 user defined functions, 
0 classes or class objects, 
0 builtin funcs and methods, and
6 callables.
whatinside(duckduckgo_search, func=True)
duckduckgo_search has: 
0 items in its __all__, and 
6 user defined functions, 
0 classes or class objects, 
0 builtin funcs and methods, and
6 callables.

The user defined functions are:
ddg:               function    (keywords, region='wt-wt', safesearch='Moderate', time=None, max_results=25, output=None)
ddg_images:        function    (keywords, region='wt-wt', safesearch='Moderate', time=None, size=None, color=None, type_image=None, layout=None, license_image=None, max_results=100, output=None, download=False)
ddg_maps:          function    (keywords, place=None, street=None, city=None, county=None, state=None, country=None, postalcode=None, latitude=None, longitude=None, radius=0, max_results=None, output=None)
ddg_news:          function    (keywords, region='wt-wt', safesearch='Moderate', time=None, max_results=25, output=None)
ddg_translate:     function    (keywords, from_=None, to='en', output=None)
ddg_videos:        function    (keywords, region='wt-wt', safesearch='Moderate', time=None, resolution=None, duration=None, license_videos=None, max_results=50, output=None)

How to use fastdebug with fastai notebooks

how to use fastdebug

from fastdebug.utils import *
from fastdebug.core import *
import fastdebug.utils as fu
import fastdebug.core as core
fastdebug.utils has: 
29 items in its __all__, and 
42 user defined functions, 
3 classes or class objects, 
0 builtin funcs and methods, and
45 callables.

test_eq:           function    `test` that `a==b`
test_is:           function    `test` that `a is b`
FunctionType:      class, type    Create a function object.

  a code object
  the globals dictionary
  a string that overrides the name from the code object
  a tuple that specifies the default argument values
  a tuple that supplies the bindings for free variables
MethodType:        class, type    method(function, instance)

Create a bound instance method object.
nb_url:            function    run this func to get nb_url of this current notebook
nb_path:           function    run this func to get nb_path of this current notebook
nb_name:           function    run this func to get nb_path of this current notebook
ipy2md:            function    convert the current notebook to md
expandcell:        function    expand cells of the current notebook to its full width
inspect_class:     function    examine the details of a class
ismetaclass:       function    check whether a class is a metaclass or not
isdecorator:       decorator, function    check whether a function is a decorator
whatinside:        function    Check what inside a module: `__all__`, functions, classes, builtins, and callables
whichversion:      function    Give you library version and other basic info.
fastview:          function    to view the commented src code in color print and with examples
fastsrcs:          function    to list all commented src files
getrootport:       function    get the local port and notebook dir
jn_link:           function    Get a link to the notebook at `path` on Jupyter Notebook
get_all_nbs:       function    return paths for all nbs both in md and ipynb format into lists
openNB:            function    Get a link to the notebook at by searching keyword or notebook name
highlight:         function    highlight a string with yellow background
display_md:        function    Get a link to the notebook at `path` on Jupyter Notebook
display_block:     function    `line` is a section title, find all subsequent lines which belongs to the same section and display them together
fastnbs:           function    check with fastlistnbs() to find interesting things to search fastnbs() can use keywords to search learning points (a section title and a section itself) from my documented fastai notebooks
fastcodes:         function    using keywords to search learning points from commented sources files
fastnotes:         function    using key words to search notes and display the found line and lines surround it
fastlistnbs:       function    display all my commented notebooks subheadings in a long list. Best to work with fastnbs together.
fastlistsrcs:      function    display all my commented src codes learning comments in a long list
whatinside(core, dun=True)
fastdebug.core has: 
14 items in its __all__, and 
117 user defined functions, 
18 classes or class objects, 
1 builtin funcs and methods, and
138 callables.

pprint:                       function    Pretty-print a Python object to a stream [default is sys.stdout].
dbcolors:                     class, type    None
randomColor:                  function    create a random color by return a random dbcolor from dbcolors
colorize:                     function    return the string with dbcolors
strip_ansi:                   function    to make printright work using regex
printright:                   function    print a block of text to the right of the cell
printsrclinewithidx:          function    add idx number to a srcline
printsrc:                     function    print the seleted srcline with comment, idx and specified num of expanding srclines
dbprintinsert:                function    insert arbitary code expressions into source code for evaluation
Fastdb:                       class, type    None
randomize_cmtparts_color:     function    give each comment a different color for easy viewing
reliveonce:                   function    Replace current version of srcode with older version, and back to normal

is Fastdb a metaclass: False
is Fastdb created by a metaclass: False
Fastdb is created by <class 'type'>
Fastdb.__new__ is object.__new__: True
Fastdb.__new__ is type.__new__: False
Fastdb.__new__: <built-in method __new__ of type object>
Fastdb.__init__ is object.__init__: False
Fastdb.__init__ is type.__init__: False
Fastdb.__init__: <function Fastdb.__init__>
Fastdb.__call__ is object.__call__: False
Fastdb.__call__ is type.__call__: False
Fastdb.__call__: <method-wrapper '__call__' of type object>
Fastdb.__class__: <class 'type'>
Fastdb.__bases__: (<class 'object'>,)
Fastdb.__mro__: (<class 'fastdebug.core.Fastdb'>, <class 'object'>)

Fastdb's function members are:
__init__: Create a Fastdebug class which has two functionalities: dbprint and print.
autoprint: print srcode with appropriate number of lines automatically
create_dbsrc_from_string: create dbsrc from a string
create_dbsrc_string: create the dbsrc string
create_explore_from_string: evaluate the explore dbsrc from string
create_explore_str: create the explore dbsrc string
create_snoop_from_string: evaluate the snoop dbsrc from string
create_snoop_str: creat the snoop dbsrc string
debug: to quickly check for clues of errors
docsrc: create dbsrc the string and turn the string into actual dbsrc function, we have self.dbsrcstr and self.dbsrc available from now on.
explore: insert 'import ipdb; ipdb.set_trace()' above srcline of idx to create dbsrc, and exec on dbsrc
goback: Return src back to original state.
print: Print the source code in whole or parts with idx and comments you added with dbprint along the way.
printcmts1: print the entire srcode and save it to a file if save=True
printcmts2: print the srcodes in parts
printtitle: print title which includes src name, line number under investigation, example.
replaceWithDbsrc: to replace self.orisrc.__name__ with 'self.dbsrc' and assign this new self.eg to self.eg
run_example: run self.eg with self.dbsrc
snoop: run snoop on the func or class under investigation only when example is available
takeoutExample: get the line of example code with srcode name in it

Fastdb's method members are:

Fastdb's class members are:
{'__class__': <class 'type'>}

Fastdb's namespace are:
mappingproxy({'__dict__': <attribute '__dict__' of 'Fastdb' objects>,
              '__doc__': None,
              '__init__': <function Fastdb.__init__>,
              '__module__': 'fastdebug.core',
              '__weakref__': <attribute '__weakref__' of 'Fastdb' objects>,
              'autoprint': <function Fastdb.autoprint>,
              'create_dbsrc_from_string': <function Fastdb.create_dbsrc_from_string>,
              'create_dbsrc_string': <function Fastdb.create_dbsrc_string>,
              'create_explore_from_string': <function Fastdb.create_explore_from_string>,
              'create_explore_str': <function Fastdb.create_explore_str>,
              'create_snoop_from_string': <function Fastdb.create_snoop_from_string>,
              'create_snoop_str': <function Fastdb.create_snoop_str>,
              'debug': <function Fastdb.debug>,
              'docsrc': <function Fastdb.docsrc>,
              'explore': <function Fastdb.explore>,
              'goback': <function Fastdb.goback>,
              'print': <function Fastdb.print>,
              'printcmts1': <function Fastdb.printcmts1>,
              'printcmts2': <function Fastdb.printcmts2>,
              'printtitle': <function Fastdb.printtitle>,
              'replaceWithDbsrc': <function Fastdb.replaceWithDbsrc>,
              'run_example': <function Fastdb.run_example>,
              'snoop': <function Fastdb.snoop>,
              'takeoutExample': <function Fastdb.takeoutExample>})
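The first few lines of the report above (is it a metaclass, which class created it, where __new__ comes from) can be reproduced with plain Python. A minimal sketch, assuming nothing about how inspect_class is written; is_metaclass, Meta, Plain and Made are hypothetical names for illustration only.

```python
def is_metaclass(cls):
    """Hypothetical check: a metaclass is a class whose instances are classes."""
    return isinstance(cls, type) and issubclass(cls, type)


class Meta(type):
    pass


class Plain:
    pass


class Made(metaclass=Meta):
    pass


# Plain behaves like Fastdb in the report: created by type, not a metaclass.
print(is_metaclass(Plain), type(Plain) is type)
# Plain does not define __new__, so it inherits object's.
print(Plain.__new__ is object.__new__)
# Meta is a metaclass, and Made is a class created by it.
print(is_metaclass(Meta), type(Made) is Meta)
```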

Did I document it in a notebook before?

Run push-code-new in the terminal to convert all current notebooks into md files,

so that the following search gets me the latest result if I have documented similar things:

fastnbs("what is fastdebug")

I can also list all the notebook subheadings with fastlistnbs(),
then use cmd + f to search for keywords and check whether I have documented something similar.


Did I document it in a src before?

fastcodes("how to access parameters")

keyword match is 1.0 , found a line: in _rm_self.py

    sigd = dict(sig.parameters)===========================================================(1) # how to access parameters from a signature; how is parameters stored in sig; how to turn parameters into a dict;; 

the entire source code in _rm_self.py

class Foo:
    def __init__(self, a, b:int=1): pass

def _rm_self(sig):========================================================================(0) # remove parameter self from a signature which has self;; 
    sigd = dict(sig.parameters)===========================================================(1) # how to access parameters from a signature; how is parameters stored in sig; how to turn parameters into a dict;; 
    sigd.pop('self')======================================================================(2) # how to remove the self parameter from the dict of sig;; 
    return sig.replace(parameters=sigd.values())==========================================(3) # how to update a sig using a updated dict of sig's parameters; 
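To see the commented lines above in action, here is a minimal sketch that runs _rm_self on the Foo class from the same file, using inspect.signature from the standard library. The driver lines at the bottom are my own assumption about how to exercise it; they are not taken from the source file.

```python
import inspect


class Foo:
    def __init__(self, a, b: int = 1): pass


def _rm_self(sig):
    sigd = dict(sig.parameters)   # parameters is an ordered mapping; copy it into a dict
    sigd.pop('self')              # drop the self parameter
    return sig.replace(parameters=sigd.values())  # rebuild the signature without self


sig = inspect.signature(Foo.__init__)
new_sig = _rm_self(sig)
print(sig)
print(new_sig)
```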

I can check all the commented src files.


I can print out all the learning points as comments inside each src file

However, I need to figure out a way to extract them nicely from the files

Todos: how to comment src for list extraction

 test_sig(f:FunctionType or ClassType, b:str); test_sig will get f's signature as a string; b is a signature in string provided by the user; in fact, test_sig is to compare two strings; 
 test_sig is to test two strings with test_eq; how to turn a signature into a string;; 
 since t2 just references t these will be the same
 likewise, changing an attribute on t will also affect t2 because they both point to the same object.
 both t and t2's __class__ is _T
 BypassNewMeta allows its instance class e.g., _T to choose a specific class e.g., _TestB and change `__class__` of an object e.g., t of _TestB to _T without creating a new object; 
 If the instance class like _T has attr '_new_meta', then run it with param x;; 
 when x is not an instance of _T's _bypass_type; or when a positional param is given; or when a keyword arg is given; let's run _T's super's __call__ function with x as param; and assign the result to x;  (4)
 If x.__class__ is not cls or _T, then make it so; 
 learn about /tmp folder https://www.fosslinux.com/41739/linux-tmp-directory-everything-you-need-to-know.htm                                       (1)
             exec(dbsrc, locals(), self.egEnv)                ===========================(6)       
     exec(code, globals().update(self.outenv), locals())  when dbsrc is a method, it will update as part of a class                                               (8)
 store dbsrc func inside Fastdb obj==================================================(9)       
 using __new__ of  FixSigMeta instead of type
 Any class having FixSigMeta as metaclass will have its own __init__ func stored in its attr __signature__;FixSigMeta uses its __new__ to create a class instance; then check whether its class instance has its own __init__;if so, remove self from the sig of __init__; then assign this new sig to __signature__ for the class instance;; 
 how does a metaclass create a class instance; what does super().__new__() do here;; 
 how to remove self from a signature; how to check whether a class' __init__ is inherited from object or not;;  (4)
 allows you to add method b upon instantiation
 don't forget to include **kwargs in __init__
 the attempt to add a is ignored and uses the original method instead.
 access the num attribute from the instance
 adds method b
 self.num + 5 = 10
multiply instead of add 
 add method b from the super class
 3 * 5 = 15
 how funcs_kwargs works; it is a wrapper around _funcs_kwargs; it offers two ways of running _funcs_kwargs; the first, default way, is to add a func to a class without using self; second way is to add func to class enabling self use;; 
 how to check whether an object is callable; how to return a result of running a func; ; 
 how to custom the params of `_funcs_kwargs` for a particular use with partial; 
 if `o` is not an object without an attribute `foo`, set foo = 1
 1 was not of type _T, so foo = 1
 t2 will now reference t
 t and t2 are the same object
 this will also change t.foo to 5 because it is the same object
 without any arguments the constructor will return a reference to the same object
 NewChkMeta is a metaclass inherited from FixSigMeta; it makes its own __call__; when its class instance, e.g., _T, create object instances (e.g., t) without args nor kwargs but only x, and x is an object of the instance class, then return x; otherwise, create and return a new object created by the instance class's super class' __call__ method with x as param; In other words, t = _T(3) will create a new obj; _T(t) will return t; _T(t, 1) or _T(t, b=1) will also return a new obj; 
 how to create a __call__ method with param cls, x, *args, **kwargs;; 
 how to express no args and no kwargs and x is an instance of cls?; 
 how to call __call__ of super class with x and consider all possible situations of args and kwargs; 
 make sure self.orieg has no self inside===================(4)       
 how to use :=<, :=>, :=^ with format to align text to left, right, and middle;  (5)
 h=10 is initialized in the parent class
 AutoInit inherit __new__ and __init__ from object to create and initialize object instances; AutoInit uses PrePostInitMeta.__new__ or in fact FixSigMeta.__new__ to create its own class instance, which can have __signature__; AutoInit uses PrePostInitMeta.__call__ to specify how its object instance to be created and initialized (with pre_init, init, post_init)); AutoInit as a normal or non-metaclass, it writes its own __pre_init__ method; 
 how to run superclass' __init__ function; 
 how to test on the type of function or method
 `1` is a dummy instance since Py3 doesn't allow `None` any more=====================(2)       
 remove parameter self from a signature which has self;; 
 how to access parameters from a signature; how is parameters stored in sig; how to turn parameters into a dict;; 
 how to remove the self parameter from the dict of sig;; 
 how to update a sig using a updated dict of sig's parameters; 
 pprint and inspect is loaded from fastdebug
 Delegatee===========================================(0)  Keep `kwargs` in decorated function?==========================(1)       
 Exclude these parameters from signature===================(2)  how to write 2 ifs and elses in 2 lines; 
 how to assign a,b together with if and else; 
 Is classmethod callable; does classmethod has __func__; can we do inspect.signature(clsmethod); how to use getattr(obj, attr, default); 
 if B has __delwrap__, can we do delegates(A)(B) again?; hasattr(obj, '__delwrap__'); 
 how to get signature obj of B; what does a signature look like; what is the type; 
 How to access parameters of a signature?; How to turn parameters into a dict?; 
 How to remove an item from a dict?; How to get the removed item from a dict?; How to add the removed item back to the dict?; when writing expressions, as they share environment, so they may affect the following code; 
 How to access a signature's parameters as a dict?; How to replace the kind of a parameter with a different kind?; how to check whether a parameter has a default value?; How to check whether a string is in a dict and a list?; how dict.items() and dict.values() differ;  (14)
 How to get A's __annotations__?; How to access it as a dict?; How to select annotations of the right params with names?; How to put them into a dict?; How to do it all in a single line;  (16)
 How to add the selected params from A's signature to B's signature; How to add items into a dict;; 
 How to add a new item into a dict;; 
 How to create a new attr for a function or obj;; 
 How to update a signature with a new set of parameters;; 
 How to check whether a func has __annotations__; How add selected params' annotations from A to B's annotations;; 
 set with __pre_init__
 set with __init__
 set with __post_init__
 PrePostInitMeta inherit __new__ and __init__ from FixSigMeta as a metaclass (a different type); not from type, nor from object; PrePostInitMeta is itself a metaclass, which is used to create class instance not object instance; PrePostInitMeta writes its own __call__ which regulates how its class instance create and initialize object instance; 
 how to create an object instance with a cls; how to check the type of an object is cls; how to run a function without knowing its params;; 
 how to run __init__ without knowing its params; 
 allows you to add method b upon instantiation
 don't forget to include **kwargs in __init__
 the attempt to add a is ignored and uses the original method instead.
 how does _funcs_kwargs work: _funcs_kwargs is a decorator; it helps class e.g., T to add more methods; I need to give the method a name, and put the name e.g., 'b' inside a list called _methods=['b'] inside class T; then after writing a func e.g., _new_func, I can add it by T(b = _new_func); if I want the func added to class to use self, I shall write @funcs_kwargs(as_method=True); 
 how to define a method which can use self and accept any parameters; 
 how to pop out the value of an item in a dict (with None as default), and if the item name is not found, pop out None instead; ; 
 how to turn a func into a method; 
 how to give a method a different instance, like self; 
 how to add a method to a class as an attribute; 
 how to wrap `_init` around `old_init`, so that `_init` can use `old_init` inside itself; 
 how to add a list of names with None as default value to function `_init` to replace its kwargs param; 
 how to make a class.`__init__` signature to be the signature of the class using `__signature__` and `_rm_self`;  (12)
 module, e.g., `import fastcore.all as fa`, use `fa` here=============(0)       
 print all items in __all__===============================(1)       
 print all user defined functions========================(2)       
 print all class objects=================================(3)       
 print all builtin funcs or methods=====================(4)       
 print all the modules of the library it belongs to=======(5)       
 print all callables=======================================(6)       
 how many items inside mo.__all__?; 
 get all funcs of a module; 
 get all classes from the module; 
 get the file path of the module; 
 get names of all modules of a lib; 
             print(f"{i[0]}: {kind}")  ==================================================(44)      
             print(f"{i[0]}: {kind}")  ==================================================(56)      
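One of the learning points in the list above is about using :=<, :=> and :=^ with format to align text. A minimal sketch of what those format specs do, with "=" as the fill character; the variable names and the width of 6 are arbitrary choices for illustration.

```python
name = "ab"

left = f"{name:=<6}"    # text on the left, padded with "=" on the right
right = f"{name:=>6}"   # text on the right, padded with "=" on the left
center = f"{name:=^6}"  # text in the middle, padded on both sides

print(left)
print(right)
print(center)
```

The same left-aligned padding produces the long "=====" rulers before the (idx) numbers in the commented source lines.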

how to search and get a url of an image; how to download with an url; how to view an image;

from duckduckgo_search import ddg_images
from fastcore.all import *
def search_images(term, max_images=30):
    print(f"Searching for '{term}'")
    return L(ddg_images(term, max_results=max_images)).itemgot('image')
#NB: `search_images` depends on duckduckgo.com, which doesn't always return correct responses.
#    If you get a JSON error, just try running it again (it may take a couple of tries).
urls = search_images('bird photos', max_images=1)
Searching for 'bird photos'
from fastdownload import download_url
dest = 'bird.jpg'
download_url(urls[0], dest, show_progress=False)

from fastai.vision.all import *
im = Image.open(dest)

download_url(search_images('forest photos', max_images=1)[0], 'forest.jpg', show_progress=False)
Searching for 'forest photos'
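The note in the code above says that a JSON error usually goes away if the search is run again. That advice can be sketched as a small retry wrapper. The retry helper and the flaky_search stand-in below are hypothetical (they are not part of duckduckgo_search or fastai), and the stand-in simulates the failure so the example runs without a network.

```python
import time


def retry(fn, attempts=3, delay=0.0, exceptions=(Exception,)):
    """Hypothetical helper: call fn() until it succeeds or attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exceptions:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay)


# Stand-in for a flaky search: fails twice, then returns one URL.
calls = {"n": 0}


def flaky_search():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("simulated JSON error")
    return ["https://example.com/bird.jpg"]


urls_demo = retry(flaky_search)
print(calls["n"], urls_demo)
```

In the notebook, the same wrapper could be applied as retry(lambda: search_images('bird photos', max_images=1)), assuming the failure surfaces as an exception.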

how to create folders using path; how to search and download images in folders; how to resize images

Our searches seem to be giving reasonable results, so let’s grab examples of each of “bird” and “forest” photos (three searches per category, each returning up to 30 results with the search_images default), and save each group of photos to a different folder:

searches = 'forest','bird'
path = Path('bird_or_not')
from time import sleep

for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    sleep(10)  # Pause between searches to avoid over-loading server
    download_images(dest, urls=search_images(f'{o} sun photo'))
    sleep(10)  # Pause again before the third search
    download_images(dest, urls=search_images(f'{o} shade photo'))
    resize_images(path/o, max_size=400, dest=path/o)
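The folder-creation part of the loop above can be tried on its own with pathlib. A minimal sketch that uses a temporary directory instead of the real bird_or_not folder, so nothing is downloaded; it only shows what mkdir(exist_ok=True, parents=True) does.

```python
import tempfile
from pathlib import Path

# Use a temp folder as the root so the demo leaves the working directory alone.
root = Path(tempfile.mkdtemp()) / "bird_or_not"

for label in ("forest", "bird"):
    dest = root / label
    # parents=True also creates the missing bird_or_not folder;
    # exist_ok=True makes a repeated call harmless.
    dest.mkdir(exist_ok=True, parents=True)
    dest.mkdir(exist_ok=True, parents=True)

created = sorted(p.name for p in root.iterdir())
print(created)
```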

Train my model

How to create a DataLoaders with DataBlock; how to view data with it

To train a model, we’ll need DataLoaders, an object that contains:

  1. a training set (the images used to create a model) and

  2. a validation set (the images used to check the accuracy of a model – not used during training).

We can create it with a DataBlock and view sample images from it:

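dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,  # collect every image file under path
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,  # label each image with its parent folder name
    item_tfms=[Resize(192, method='squish')]
).dataloaders(path)

dls.show_batch(max_n=6)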
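Two ideas behind the DataBlock above can be sketched in plain Python: a seeded random split that holds out 20% of the items for validation, and labelling a file by the name of its parent folder. The random_split function and the file name img1.jpg are hypothetical; fastai's RandomSplitter is implemented differently and will not produce the same indices.

```python
import random
from pathlib import Path


def random_split(n_items, valid_pct=0.2, seed=None):
    """Hypothetical stand-in for a random splitter: returns (train_idxs, valid_idxs)."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)  # same seed, same shuffle, same split
    cut = int(valid_pct * n_items)
    return idxs[cut:], idxs[:cut]


train_idxs, valid_idxs = random_split(10, valid_pct=0.2, seed=42)
print(len(train_idxs), len(valid_idxs))

# Labelling by folder: the label is the name of the folder holding the image.
label = Path("bird_or_not/bird/img1.jpg").parent.name
print(label)
```

Fixing the seed is what makes the validation set the same on every run, so results stay comparable between experiments.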


How to build my model with dataloaders and pretrained model; how to train my model

Now we’re ready to train our model. The fastest widely used computer vision model is resnet18. You can train this in a few minutes, even on a CPU! (On a GPU, it generally takes under 10 seconds…)

fastai comes with a helpful fine_tune() method which automatically uses best practices for fine tuning a pre-trained model, so we’ll use that.

learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)  # fine-tune the pretrained model; 3 epochs is an assumed setting
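The error_rate metric passed to the learner is the fraction of validation items the model gets wrong, that is, 1 minus accuracy. A minimal sketch of that arithmetic in plain Python; error_rate_sketch and the sample predictions are made up for illustration and are not fastai's implementation, which works on tensors.

```python
def error_rate_sketch(preds, targets):
    """Hypothetical helper: share of predictions that differ from the targets."""
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)


preds = ["bird", "forest", "bird", "bird"]
targets = ["bird", "forest", "forest", "bird"]

err = error_rate_sketch(preds, targets)
print(f"error rate: {err:.2f}, accuracy: {1 - err:.2f}")
```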

How to predict with my model; how to avoid running cells in nbdev_prepare

is_bird,_,probs = learn.predict(PILImage.create('bird.jpg'))
print(f"This is a: {is_bird}.")
print(f"Probability it's a bird: {probs[0]:.4f}")
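Why probs[0] is the bird probability: as far as I understand, the category vocabulary is sorted, so with the two folder names used here 'bird' comes before 'forest' and gets index 0. A minimal sketch of that indexing in plain Python; the probability values are hypothetical, not real model output.

```python
vocab = sorted(["forest", "bird"])  # sorted labels: 'bird' gets index 0
probs = [0.9987, 0.0013]            # hypothetical probabilities, one per label

# The predicted label is the one with the highest probability.
pred_idx = max(range(len(probs)), key=probs.__getitem__)
label = vocab[pred_idx]

print(f"This is a: {label}.")
print(f"Probability it's a bird: {probs[vocab.index('bird')]:.4f}")
```

Looking the index up with vocab.index('bird') is a little safer than hard-coding 0, since it keeps working if more categories are added.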
