{"id":67078,"date":"2024-06-17T21:09:47","date_gmt":"2024-06-17T17:39:47","guid":{"rendered":"https:\/\/nabfollower.com\/blog\/wk-1-mlops-with-datatalks-5ah5\/"},"modified":"2024-06-17T21:09:47","modified_gmt":"2024-06-17T17:39:47","slug":"wk-1-mlops-with-datatalks-5ah5","status":"publish","type":"post","link":"https:\/\/nabfollower.com\/blog\/wk-1-mlops-with-datatalks-5ah5\/","title":{"rendered":"Week 1: MLOps with DataTalks"},"content":{"rendered":"<p>I recently joined the DataTalks 2024 cohort to earn the MLOps certification and, more broadly, to build up my machine learning pipeline skills.  
To complete the course, there are assignments due every other week.<\/p>\n<p>This series documents how I tackled these assignments, and is meant as a worked solution for anyone who is struggling with them.<\/p>\n<p>Week 1<br \/>The assignment here is basic; you need prerequisite skills in Python, ML libraries, and bash scripting to see it through.  
See the assignment below:<\/p>\n<p>We will build a Jupyter Notebook that addresses each question. <\/p>\n<p>First things first, create a directory to hold your work, now and later on, like this:<\/p>\n<p>MLOPS<br \/>\n |<br \/>\n - wk1<\/p>\n<p>Then create a virtual environment in the parent directory; this is where you install all the packages needed for the whole journey:<\/p>\n<p>Open a bash shell and run the commands below:<\/p>\n<p>cd MLOPS<br \/>\npython3.10 -m venv MLOPS_venv<br \/>\nsource MLOPS_venv\/Scripts\/activate<\/p>\n<p>This creates a virtual environment named MLOPS_venv with Python 3.10, stored in a directory of the same name inside the parent folder; you can now install packages into this environment.  
The last line activates the environment. The Scripts path applies on Windows (for example in Git Bash); on Linux or macOS the activation script is MLOPS_venv\/bin\/activate instead.<\/p>\n<p>To deactivate it, run: deactivate<\/p>\n<p>  Wk1:<\/p>\n<p>  Setup<\/p>\n<p>mkdir wk1<br \/>\ncd wk1<br \/>\nmkdir datasets<br \/>\ncode .<\/p>\n<p>This creates the wk1 directory for this week's work (if you have not created it already), moves into it, creates a datasets subdirectory, and then launches VS Code. In VS Code, press Ctrl+Shift+P to open the command palette, create a new notebook, and name it homework.<\/p>\n<p>Once the notebook is created, make sure you have set its kernel to the MLOPS_venv environment. <\/p>\n<p>  Jupyter Notebook<\/p>\n<p>In your homework.ipynb notebook, run !ls to check that you have the necessary directories; the output should look like this:<\/p>\n<p>Then install the libraries you need, like so:<br \/>\n## Install Packages<br \/>\n!pip install numpy pandas seaborn scikit-learn<\/p>\n<p>! - a leading exclamation mark is how Jupyter notebooks run shell commands. Note that pd.read_parquet, used below, also needs a parquet engine such as pyarrow; install it too if it is missing.<\/p>\n<p>Q1: Yellow taxis - download the data for January and February 2023.<\/p>\n<p>1.1 Download the datasets<\/p>\n<p>## Download Yellow Taxi Trips Files<br \/>\n! 
curl -o .\/datasets\/jan_yellow.parquet https:\/\/d37ci6vzurychx.cloudfront.net\/trip-data\/yellow_tripdata_2023-01.parquet<br \/>\n! curl -o .\/datasets\/feb_yellow.parquet https:\/\/d37ci6vzurychx.cloudfront.net\/trip-data\/yellow_tripdata_2023-02.parquet<\/p>\n<p>curl is a tool for transferring data to or from a server. Here is what each part does:<\/p>\n<p>curl - the command-line tool for making requests to URLs.<br \/>\n-o - this option tells curl to save the output to a file instead of displaying it.<br \/>\n.\/datasets\/jan_yellow.parquet - the path where the first file will be saved.<\/p>\n<p>So the first command downloads the file yellow_tripdata_2023-01.parquet from the given URL and saves it as jan_yellow.parquet in the datasets directory.<\/p>\n<p>The second command does the same for the February file and saves it as feb_yellow.parquet.<\/p>\n<p>1.2 Import libraries<\/p>\n<p>## Load Libraries<br \/>\nimport numpy as np<br \/>\nimport pandas as pd<br \/>\nfrom sklearn.feature_extraction import DictVectorizer<\/p>\n<p>## Load Dataset<br \/>\njan_df = pd.read_parquet(&quot;.\/datasets\/jan_yellow.parquet&quot;)<br \/>\nprint(f&quot;1, Data Dimension: {jan_df.shape[0]} rows | {jan_df.shape[1]} columns \\n&quot;)<\/p>\n<p>=> Data Dimension: 3066766 rows | 20 columns<br \/>\nThis output answers the question.<\/p>\n<p>Q2: Compute the duration variable (in minutes). What is the standard deviation of the trip durations in January?<\/p>\n<p>2.1 Compute trip duration and its standard deviation<\/p>\n<p>jan_df[[&quot;tpep_pickup_datetime&quot;, &quot;tpep_dropoff_datetime&quot;]] = jan_df[[&quot;tpep_pickup_datetime&quot;, &quot;tpep_dropoff_datetime&quot;]].apply(pd.to_datetime)<br \/>\njan_df[&quot;duration&quot;] = (jan_df[&quot;tpep_dropoff_datetime&quot;] - jan_df[&quot;tpep_pickup_datetime&quot;]).dt.total_seconds()\/60<\/p>\n<p>print(f&quot;2, Duration Standard Deviation: {jan_df['duration'].std()} \\n&quot;)<\/p>\n<p>=> Duration Standard Deviation: 42.59435124195458.<\/p>\n<p>This code converts the &quot;tpep_pickup_datetime&quot; and &quot;tpep_dropoff_datetime&quot; columns of the jan_df DataFrame to datetime objects using the pandas to_datetime function.  
It then computes the duration of each trip by subtracting the pickup time from the dropoff time, converting the result to total seconds, and dividing by 60 to get the duration in minutes.<\/p>\n<p>Q3: Dropping outliers<\/p>\n<p>filtered_duration = jan_df[jan_df['duration'].between(1,60)]<br \/>\nclean_prop = len(filtered_duration['duration'])\/len(jan_df['duration'])<\/p>\n<p>print(f&quot;3, Outlier Proportion: {clean_prop} \\n&quot;)<\/p>\n<p>~> 98%<\/p>\n<p>We filter jan_df to keep only the rows whose &quot;duration&quot; value is between 1 and 60 minutes, inclusive.  
That is about 98 percent of the original DataFrame. Note that clean_prop is the fraction of records that remain after filtering, not the fraction of outliers.<\/p>\n<p>Q4: Dimensionality of the feature matrix<\/p>\n<p>## Filtered columns<br \/>\nml_df = filtered_duration[['PULocationID', 'DOLocationID']].astype(str)<br \/>\nml_df['duration'] = filtered_duration['duration']<\/p>\n<p>## Dictionaries<br \/>\ndicts_train = ml_df[['PULocationID', 'DOLocationID']].to_dict(orient=&quot;records&quot;)<br \/>\ndicts_train[1:5]<\/p>\n<p>## Vectorizers<br \/>\nvec = DictVectorizer(sparse = True)<br \/>\nfeature_matrix = vec.fit_transform(dicts_train)<\/p>\n<p>print(f&quot;4, Dimension of feature_matrix: {feature_matrix.shape} \\n&quot;)<\/p>\n<p>=> 4, Dimension of feature_matrix: (3009173, 515)<\/p>\n<p>This code does the following:<\/p>\n<p>Creates a new DataFrame, ml_df, from the &quot;PULocationID&quot; and &quot;DOLocationID&quot; columns of filtered_duration and converts them to strings, so that DictVectorizer one-hot encodes them as categories.<br \/>\nAdds the &quot;duration&quot; column from filtered_duration to ml_df.<br \/>\nConverts the two location columns of ml_df into a list of dictionaries with the keys &quot;PULocationID&quot; and &quot;DOLocationID&quot;, using the to_dict method with orient=&quot;records&quot;.<br \/>\nInitializes a DictVectorizer, which turns a list of dictionaries into a feature matrix for machine learning models.<br \/>\nTransforms the list of dictionaries into a sparse matrix, feature_matrix.<br \/>\nPrints the dimensions of feature_matrix.<br \/>\nThe output shows the number of rows and columns in the feature matrix.<\/p>\n<p>Q5: 
Training a linear regression model<\/p>\n<p>## Linear Regression Model<br \/>\nfrom sklearn.linear_model import LinearRegression<br \/>\nfrom sklearn.metrics import mean_squared_error<\/p>\n<p>y = ml_df['duration']<\/p>\n<p>model = LinearRegression()<br \/>\nmodel.fit(feature_matrix, y)<br \/>\ny_pred = model.predict(feature_matrix)<br \/>\nrmse = np.sqrt(mean_squared_error(y, y_pred))<\/p>\n<p>print(f&quot;5, RMSE: {rmse}&quot;)<\/p>\n<p>=> RMSE: 7.649262236295703<\/p>\n<p>Here we import the linear regression model from the scikit-learn (sklearn) library, fit it on the feature matrix with duration as the target variable, and then predict on the same training data. 
<\/p>\n<p>The root mean squared error (RMSE) is computed from the differences between the actual and predicted values of the target variable; the lower the value, the better.<\/p>\n<p>Q6: Evaluating the model<\/p>\n<p>Here we apply everything we have done so far to the February dataset, which serves as the validation set, simply by wrapping the steps in a function:<\/p>\n<p>## Compile chunks into a function<br \/>\ndef rmse_validation(df_pth: str):<br \/>\n    val_df = pd.read_parquet(df_pth)<br \/>\n    val_df[[&quot;tpep_pickup_datetime&quot;, &quot;tpep_dropoff_datetime&quot;]] = val_df[[&quot;tpep_pickup_datetime&quot;, &quot;tpep_dropoff_datetime&quot;]].apply(pd.to_datetime)<br \/>\n    val_df[&quot;duration&quot;] = (val_df[&quot;tpep_dropoff_datetime&quot;] - val_df[&quot;tpep_pickup_datetime&quot;]).dt.total_seconds()\/60<br \/>\n    # .copy() avoids pandas' SettingWithCopyWarning on the column assignment below<br \/>\n    val_df = val_df[val_df['duration'].between(1,60)].copy()<\/p>\n<p>    val_df[['PULocationID', 'DOLocationID']] = val_df[['PULocationID', 'DOLocationID']].astype(str)<br \/>\n    dicts_val = val_df[['PULocationID', 'DOLocationID']].to_dict(orient=&quot;records&quot;)<\/p>\n<p>    # transform (not fit_transform) reuses the encoding learned on the training data<br \/>\n    feature_matrix_val = vec.transform(dicts_val)<br \/>\n    #print(f&quot;Dimension of feature_matrix: {feature_matrix_val.shape} \\n&quot;)<\/p>\n<p>    y_val = val_df['duration']<br \/>\n    y_pred = model.predict(feature_matrix_val)<br \/>\n    rmse = np.sqrt(mean_squared_error(y_val, y_pred))<\/p>\n<p>    return rmse<\/p>\n<p>result_feb_df = rmse_validation(&quot;.\/datasets\/feb_yellow.parquet&quot;)<br \/>\nprint(f&quot;6, Validation_RMSE: {result_feb_df}&quot;)<\/p>\n<p>=> 6, Validation_RMSE: 7.811812822882009<\/p>\n<p>The only difference here is the distinction between fit_transform and transform on the vectorizer: on the validation set we use transform, so that it simply reuses the fit already learned on the training set.<\/p>\n<p>That's it! See wk1_submission to review the code, and cheers! If you run into any issues, leave a comment below.<\/p>\n<div data-article-id=\"1891501\" id=\"article-body\">\n<p>I recently joined the DataTalks 2024 cohort to earn the <strong>MLOps<\/strong> certification and, more broadly, to build up my machine learning pipeline skills.  
To complete the course, there are assignments due every other week.<\/p>\n<p>This series documents how I tackled these assignments, and is meant as a worked solution for anyone who is struggling with them.<\/p>\n<p><strong>Week 1<\/strong><br \/>The assignment here is basic; you need prerequisite skills in Python, ML libraries, and bash scripting to see it through.  
See the assignment below:<\/p>\n<p><\/p>\n<p>We will build a <strong>Jupyter Notebook<\/strong> that addresses each question. <\/p>\n<p>First things first, create a directory to hold your work, now and later on, like this:<\/p>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>MLOPS \n |\n - wk1\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>Enter fullscreen mode<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>Exit fullscreen mode<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 
8v4h-2v-6h6v2h-4zM8 15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>Then create a virtual environment in the parent directory; this is where you install all the packages needed for the whole journey:<\/p>\n<ol>\n<li>Open a bash shell and run the commands below:\n<\/li>\n<\/ol>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>cd MLOPS\npython3.10 -m venv MLOPS_venv\nsource MLOPS_venv\/Scripts\/activate\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>Enter fullscreen mode<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>Exit fullscreen mode<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 8v4h-2v-6h6v2h-4zM8 15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>This creates a virtual environment named <strong>MLOPS_venv<\/strong> with Python 3.10, stored in a directory of the same name inside the parent folder; you can now install packages into this environment.  The last line activates the environment. The Scripts path applies on Windows (for example in Git Bash); on Linux or macOS the activation script is MLOPS_venv\/bin\/activate instead.<\/p>\n<p>To deactivate it, run:<br \/><code>deactivate<\/code><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_84 counter-hierarchy ez-toc-counter-rtl ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 
11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/nabfollower.com\/blog\/wk-1-mlops-with-datatalks-5ah5\/#Wk1\" >Wk1:<\/a><ul class='ez-toc-list-level-6' ><li class='ez-toc-heading-level-6'><ul class='ez-toc-list-level-6' ><li class='ez-toc-heading-level-6'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/nabfollower.com\/blog\/wk-1-mlops-with-datatalks-5ah5\/#%D8%A8%D8%B1%D9%BE%D8%A7%DB%8C%DB%8C\" >\u0628\u0631\u067e\u0627\u06cc\u06cc<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-6'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/nabfollower.com\/blog\/wk-1-mlops-with-datatalks-5ah5\/#%D9%86%D9%88%D8%AA_%D8%A8%D9%88%DA%A9_%DA%98%D9%88%D9%BE%DB%8C%D8%AA%D8%B1\" >\u0646\u0648\u062a \u0628\u0648\u06a9 \u0698\u0648\u067e\u06cc\u062a\u0631<\/a><\/li><\/ul><\/li><\/ul><\/li><\/ul><\/nav><\/div>\n<h4><span class=\"ez-toc-section\" id=\"Wk1\"><\/span>\n<p>  Wk1:<br \/>\n<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<h6><span class=\"ez-toc-section\" id=\"%D8%A8%D8%B1%D9%BE%D8%A7%DB%8C%DB%8C\"><\/span>\n<p>  \u0628\u0631\u067e\u0627\u06cc\u06cc<br \/>\n<span class=\"ez-toc-section-end\"><\/span><\/h6>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>mkdir wk1\ncd wk1\nmkdir datasets\ncode 
.\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>\u0648\u0627\u0631\u062f \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>\u0627\u0632 \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u062e\u0627\u0631\u062c \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 8v4h-2v-6h6v2h-4zM8 15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>\u0627\u06cc\u0646 \u0628\u0627\u0639\u062b \u0645\u06cc \u0634\u0648\u062f <strong>wk1<\/strong> \u062f\u0627\u06cc\u0631\u06a9\u062a\u0648\u0631\u06cc \u0628\u0647 \u06a9\u0627\u0631 \u0627\u06cc\u0646 \u0647\u0641\u062a\u0647\u060c \u0627\u06af\u0631 \u0642\u0628\u0644\u0627\u064b \u0622\u0646 \u0631\u0627 \u0627\u06cc\u062c\u0627\u062f \u0646\u06a9\u0631\u062f\u0647\u200c\u0627\u06cc\u062f\u060c \u0628\u0631\u0627\u06cc \u0627\u06cc\u062c\u0627\u062f \u06cc\u06a9 \u0628\u0647 \u062f\u0627\u062e\u0644 \u0622\u0646 \u067e\u06cc\u0645\u0627\u06cc\u0634 \u06a9\u0646\u06cc\u062f <em>\u0645\u062c\u0645\u0648\u0639\u0647 \u062f\u0627\u062f\u0647 \u0647\u0627<\/em> \u0632\u06cc\u0631 \u0634\u0627\u062e\u0647 \u0627\u06cc \u06a9\u0647 \u067e\u0633 \u0627\u0632 \u0622\u0646 \u06a9\u062f VS \u0631\u0627 \u0631\u0627\u0647 \u0627\u0646\u062f\u0627\u0632\u06cc \u0645\u06cc \u06a9\u0646\u062f\u060c <strong>Ctrl+Shift+P<\/strong> 
\u062f\u0633\u062a\u0648\u0631 \u0627\u06cc\u062c\u0627\u062f \u06cc\u06a9 \u0646\u0648\u062a \u0628\u0648\u06a9\u060c \u0646\u0627\u0645 \u0622\u0646 \u0627\u0633\u062a <em>\u0645\u0634\u0642 \u0634\u0628<\/em>.<\/p>\n<p>\u0647\u0646\u06af\u0627\u0645\u06cc \u06a9\u0647 \u0627\u06cc\u0646 \u0645\u0648\u0631\u062f \u0627\u06cc\u062c\u0627\u062f \u0634\u062f\u060c \u0645\u0637\u0645\u0626\u0646 \u0634\u0648\u06cc\u062f \u06a9\u0647 \u0647\u0633\u062a\u0647 \u0631\u0627 \u0631\u0648\u06cc the \u062a\u0646\u0638\u06cc\u0645 \u06a9\u0631\u062f\u0647 \u0627\u06cc\u062f <em>MLOPS_venv<\/em> \u0645\u062d\u06cc\u0637. <\/p>\n<h6><span class=\"ez-toc-section\" id=\"%D9%86%D9%88%D8%AA_%D8%A8%D9%88%DA%A9_%DA%98%D9%88%D9%BE%DB%8C%D8%AA%D8%B1\"><\/span>\n<p>  \u0646\u0648\u062a \u0628\u0648\u06a9 \u0698\u0648\u067e\u06cc\u062a\u0631<br \/>\n<span class=\"ez-toc-section-end\"><\/span><\/h6>\n<p>\u062f\u0631 \u0634\u0645\u0627 <em>\u062a\u06a9\u0627\u0644\u06cc\u0641.ipynb<\/em> \u0641\u0627\u06cc\u0644 \u0646\u0648\u062a \u0628\u0648\u06a9 \u0627\u062c\u0631\u0627 \u06a9\u0646\u06cc\u062f <code>!ls<\/code> \u0628\u0631\u0627\u06cc \u0627\u06cc\u0646\u06a9\u0647 \u0628\u0628\u06cc\u0646\u06cc\u062f \u062f\u0627\u06cc\u0631\u06a9\u062a\u0648\u0631\u06cc \u0647\u0627\u06cc \u0645\u0648\u0631\u062f \u0646\u06cc\u0627\u0632 \u0631\u0627 \u062f\u0627\u0631\u06cc\u062f\u060c \u0628\u0627\u06cc\u062f \u0628\u0647 \u0634\u06a9\u0644 \u0632\u06cc\u0631 \u0628\u0627\u0634\u062f:<\/p>\n<p>\u0633\u067e\u0633 \u0686\u0646\u062f \u06a9\u062a\u0627\u0628\u062e\u0627\u0646\u0647 \u0645\u0627\u0646\u0646\u062f \u0627\u06cc\u0646 \u0631\u0627 \u0646\u0635\u0628 \u06a9\u0646\u06cc\u062f:<br \/><code>## Install Packages<br \/>!pip install numpy pandas seaborn scikit-learn<\/code><\/p>\n<p><strong>!<\/strong> &#8211; \u0627\u06cc\u0646 \u062f\u0631 \u0646\u0648\u062a \u0628\u0648\u06a9 \u0647\u0627\u06cc Jupyter \u0628\u0631\u0627\u06cc \u0627\u062c\u0631\u0627\u06cc 
\u062f\u0633\u062a\u0648\u0631\u0627\u062a \u067e\u0648\u0633\u062a\u0647 \u0627\u0633\u062a\u0641\u0627\u062f\u0647 \u0645\u06cc \u0634\u0648\u062f.<\/p>\n<p><strong>Q1: \u062a\u0627\u06a9\u0633\u06cc \u0647\u0627\u06cc \u0633\u0628\u0632 &#8211; \u062f\u0627\u062f\u0647 \u0647\u0627\u06cc \u0698\u0627\u0646\u0648\u06cc\u0647 \u0648 \u0641\u0648\u0631\u06cc\u0647 2023 \u0631\u0627 \u0628\u0627\u0631\u06af\u06cc\u0631\u06cc \u06a9\u0646\u06cc\u062f.<\/strong><br \/>1.1 \u062f\u0627\u0646\u0644\u0648\u062f \u0645\u062c\u0645\u0648\u0639\u0647 \u062f\u0627\u062f\u0647 \u0647\u0627<\/p>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>## Download Yellow Taxi Trips Files\n! curl -o .\/datasets\/jan_yellow.parquet https:\/\/d37ci6vzurychx.cloudfront.net\/trip-data\/yellow_tripdata_2023-01.parquet\n! curl -o .\/datasets\/feb_yellow.parquet https:\/\/d37ci6vzurychx.cloudfront.net\/trip-data\/yellow_tripdata_2023-02.parquet\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>\u0648\u0627\u0631\u062f \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>\u0627\u0632 \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u062e\u0627\u0631\u062c \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 8v4h-2v-6h6v2h-4zM8 
15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p><strong>\u062d\u0644\u0642\u0647<\/strong> \u0627\u0628\u0632\u0627\u0631\u06cc \u0628\u0631\u0627\u06cc \u0627\u0646\u062a\u0642\u0627\u0644 \u062f\u0627\u062f\u0647 \u0647\u0627 \u0627\u0632 \u06cc\u0627 \u0628\u0647 \u0633\u0631\u0648\u0631 \u0627\u0633\u062a.  \u062f\u0631 \u0627\u06cc\u0646\u062c\u0627 \u0686\u06cc\u0632\u06cc \u0627\u0633\u062a \u06a9\u0647 \u0647\u0631 \u0628\u062e\u0634 \u0627\u0646\u062c\u0627\u0645 \u0645\u06cc \u062f\u0647\u062f:<\/p>\n<p><code>curl<\/code>  &#8211; \u0627\u0628\u0632\u0627\u0631 \u062e\u0637 \u0641\u0631\u0645\u0627\u0646 \u0628\u0631\u0627\u06cc \u062f\u0631\u062e\u0648\u0627\u0633\u062a \u0628\u0647 URL \u0647\u0627.<br \/><code>-o<\/code>&#8211; \u0627\u06cc\u0646 \u06af\u0632\u06cc\u0646\u0647 \u0628\u0647 curl \u0645\u06cc \u06af\u0648\u06cc\u062f \u06a9\u0647 \u062e\u0631\u0648\u062c\u06cc \u0631\u0627 \u0628\u0647 \u062c\u0627\u06cc \u0646\u0645\u0627\u06cc\u0634 \u062f\u0631 \u06cc\u06a9 \u0641\u0627\u06cc\u0644 \u0630\u062e\u06cc\u0631\u0647 \u06a9\u0646\u062f.<br \/><code>.\/datasets\/jan_yellow.parquet<\/code>  &#8211; \u0645\u0633\u06cc\u0631\u06cc \u06a9\u0647 \u0627\u0648\u0644\u06cc\u0646 \u0641\u0627\u06cc\u0644 \u062f\u0631 \u0622\u0646 \u0630\u062e\u06cc\u0631\u0647 \u062e\u0648\u0627\u0647\u062f \u0634\u062f.<\/p>\n<p>\u0628\u0646\u0627\u0628\u0631\u0627\u06cc\u0646\u060c \u0627\u0648\u0644\u06cc\u0646 \u062f\u0633\u062a\u0648\u0631 \u06cc\u06a9 \u0641\u0627\u06cc\u0644 \u0628\u0647 \u0646\u0627\u0645 \u0631\u0627 \u062f\u0627\u0646\u0644\u0648\u062f \u0645\u06cc \u06a9\u0646\u062f <em>yellow_tripdata_2023-01.parquet<\/em> \u0627\u0632 URL \u062f\u0627\u062f\u0647 \u0634\u062f\u0647 \u0648 \u0622\u0646 \u0631\u0627 \u0628\u0647 \u0639\u0646\u0648\u0627\u0646 \u0630\u062e\u06cc\u0631\u0647 \u0645\u06cc \u06a9\u0646\u062f <em>jan_yellow.\u067e\u0627\u0631\u06a9\u062a<\/em> \u062f\u0631 <code>datasets<\/code> 
\u0641\u0647\u0631\u0633\u062a \u0631\u0627\u0647\u0646\u0645\u0627.<\/p>\n<p>\u062f\u0633\u062a\u0648\u0631 \u062f\u0648\u0645 \u0647\u0645\u06cc\u0646 \u06a9\u0627\u0631 \u0631\u0627 \u0628\u0631\u0627\u06cc \u0641\u0627\u06cc\u0644 \u062f\u06cc\u06af\u0631\u06cc \u0627\u0646\u062c\u0627\u0645 \u0645\u06cc \u062f\u0647\u062f \u0648 \u0622\u0646 \u0631\u0627 \u0628\u0647 \u0639\u0646\u0648\u0627\u0646 \u0630\u062e\u06cc\u0631\u0647 \u0645\u06cc \u06a9\u0646\u062f <em>feb_yellow.\u067e\u0627\u0631\u06a9\u062a<\/em>.<\/p>\n<p>1.2 \u0648\u0627\u0631\u062f\u0627\u062a \u06a9\u062a\u0627\u0628\u062e\u0627\u0646\u0647 \u0647\u0627<\/p>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>## Load Libraries\nimport numpy as np\nimport pandas as pd\nfrom sklearn.feature_extraction import DictVectorizer\n\n## Load Dataset\njan_df = pd.read_parquet(\".\/datasets\/jan_yellow.parquet\")\nprint(f\"1, Data Dimension: {jan_df.shape[0]} rows | {jan_df.shape[1]} columns \\n\")\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>\u0648\u0627\u0631\u062f \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>\u0627\u0632 \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u062e\u0627\u0631\u062c \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 8v4h-2v-6h6v2h-4zM8 
15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>=> \u0627\u0628\u0639\u0627\u062f \u062f\u0627\u062f\u0647: 3066766 \u0631\u062f\u06cc\u0641 |  20 \u0633\u062a\u0648\u0646 <br \/>\u062e\u0631\u0648\u062c\u06cc \u0627\u06cc\u0646 \u067e\u0627\u0633\u062e \u0633\u0648\u0627\u0644 \u0631\u0627 \u0628\u0631\u0645\u06cc \u06af\u0631\u062f\u0627\u0646\u062f.<\/p>\n<p><strong>Q2: \u0645\u062a\u063a\u06cc\u0631 \u0645\u062f\u062a \u0632\u0645\u0627\u0646 (\u0628\u0631 \u062d\u0633\u0628 \u062f\u0642\u06cc\u0642\u0647) \u0631\u0627 \u0645\u062d\u0627\u0633\u0628\u0647 \u06a9\u0646\u06cc\u062f \u0648 \u0627\u0646\u062d\u0631\u0627\u0641 \u0627\u0633\u062a\u0627\u0646\u062f\u0627\u0631\u062f \u0645\u062f\u062a \u0633\u0641\u0631 \u062f\u0631 \u0698\u0627\u0646\u0648\u06cc\u0647 \u0631\u0627 \u0648\u0627\u06a9\u0634\u06cc \u06a9\u0646\u06cc\u062f\u061f<\/strong><\/p>\n<p>2.1 \u0645\u062d\u0627\u0633\u0628\u0647 \u0645\u062f\u062a \u0633\u0641\u0631 \u0648 \u0627\u0646\u062d\u0631\u0627\u0641 Std<\/p>\n<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>jan_df[[\"tpep_pickup_datetime\", \"tpep_dropoff_datetime\"]] = jan_df[[\"tpep_pickup_datetime\", \"tpep_dropoff_datetime\"]].apply(pd.to_datetime)\njan_df[\"duration\"] = (jan_df[\"tpep_dropoff_datetime\"] - jan_df[\"tpep_pickup_datetime\"]).dt.total_seconds()\/60\n\nprint(f\"2, Duration Standard Deviation: {jan_df['duration'].std()} \\n\")\n<\/code><\/pre>\n<div class=\"highlight__panel js-actions-panel\">\n<div class=\"highlight__panel-action js-fullscreen-code-action\">\n    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-on\"><title>\u0648\u0627\u0631\u062f \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M16 3h6v6h-2V5h-4V3zM2 3h6v2H4v4H2V3zm18 16v-4h2v6h-6v-2h4zM4 
19h4v2H2v-6h2v4z\"\/>\n<\/svg><\/p>\n<p>    <svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" class=\"highlight-action crayons-icon highlight-action--fullscreen-off\"><title>\u0627\u0632 \u062d\u0627\u0644\u062a \u062a\u0645\u0627\u0645 \u0635\u0641\u062d\u0647 \u062e\u0627\u0631\u062c \u0634\u0648\u06cc\u062f<\/title>\n    <path d=\"M18 7h4v2h-6V3h2v4zM8 9H2V7h4V3h2v6zm10 8v4h-2v-6h6v2h-4zM8 15v6H6v-4H2v-2h6z\"\/>\n<\/svg><\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>=> \u0627\u0646\u062d\u0631\u0627\u0641 \u0627\u0633\u062a\u0627\u0646\u062f\u0627\u0631\u062f \u0645\u062f\u062a \u0632\u0645\u0627\u0646: 42.59435124195458.<\/p>\n<p>\u0627\u06cc\u0646 \u06a9\u062f \u0633\u062a\u0648\u0646 \u0647\u0627\u06cc &#8220;tpep_pickup_datetime&#8221; \u0648 &#8220;tpep_dropoff_datetime&#8221; \u0631\u0627 \u062f\u0631 jan_df DataFrame \u0628\u0627 \u0627\u0633\u062a\u0641\u0627\u062f\u0647 \u0627\u0632 \u062a\u0627\u0628\u0639 to_datetime pandas \u0628\u0647 \u0627\u0634\u06cc\u0627\u0621 datetime \u062a\u0628\u062f\u06cc\u0644 \u0645\u06cc \u06a9\u0646\u062f.  
Then it computes the duration of each trip by subtracting the pickup time from the dropoff time, converting the result to total seconds, and dividing by 60 to get the duration in minutes.<\/p>\n
<p><strong>Q3: Dropping outliers<\/strong><\/p>\n
<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>filtered_duration = jan_df[jan_df['duration'].between(1,60)]\n# Fraction of the original records that survive the filter\nclean_prop = len(filtered_duration['duration'])\/len(jan_df['duration'])\n\nprint(f\"3, Proportion of records kept: {clean_prop} \\n\")\n<\/code><\/pre>\n<\/div>\n
<p>~> 98%<br \/>We filter <code>jan_df<\/code> to keep only the rows whose <code>duration<\/code> value is between 1 and 60 minutes (<code>between<\/code> includes both endpoints by default). These rows make up about 98% of the original DataFrame, so the printed value is the share of records kept, not the share of outliers.<\/p>\n
<p><strong>Q4: Dimensionality of the feature matrix<\/strong><\/p>\n
<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>## Filtered columns\nml_df = filtered_duration[['PULocationID', 'DOLocationID']].astype(str)\nml_df['duration'] = filtered_duration['duration']\n\n## Dictionaries\ndicts_train = ml_df[['PULocationID', 'DOLocationID']].to_dict(orient=\"records\")\ndicts_train[1:5]\n\n## Vectorizers\nvec = DictVectorizer(sparse = True)\nfeature_matrix = vec.fit_transform(dicts_train)\n\nprint(f\"4, Dimension of feature_matrix: {feature_matrix.shape} \\n\")\n<\/code><\/pre>\n<\/div>\n
<p>=> 4, Dimension of feature_matrix: (3009173, 515)<\/p>\n
<p>This code does the following:<\/p>\n
<ul>\n<li>Creates a new DataFrame <code>ml_df<\/code> with the <code>PULocationID<\/code> and <code>DOLocationID<\/code> columns from <code>filtered_duration<\/code> and converts them to strings, so that <code>DictVectorizer<\/code> one-hot encodes them.<\/li>\n<li>Adds the <code>duration<\/code> column from <code>filtered_duration<\/code> to <code>ml_df<\/code>.<\/li>\n<li>Converts the two location columns of <code>ml_df<\/code> to a list of dictionaries with the keys <code>PULocationID<\/code> and <code>DOLocationID<\/code>, using the <code>to_dict<\/code> method with <code>orient=\"records\"<\/code>.<\/li>\n<li>Initializes a <code>DictVectorizer<\/code>, which converts a list of dictionaries into a feature matrix for machine learning models.<\/li>\n<li>Transforms the list of dictionaries into a sparse matrix, <code>feature_matrix<\/code>.<\/li>\n<li>Prints the dimensions of <code>feature_matrix<\/code>.<\/li>\n<li>The output shows the number of rows and columns in the feature matrix.<\/li>\n<\/ul>\n
<p><strong>Q5: Training a linear regression model<\/strong><\/p>\n
<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>## Linear Regression Model\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.metrics import mean_squared_error\n\ny = ml_df['duration']\n\nmodel = LinearRegression()\nmodel.fit(feature_matrix, y)\ny_pred = model.predict(feature_matrix)\nrmse = np.sqrt(mean_squared_error(y, y_pred))\n\nprint(f\"5, RMSE: {rmse}\")\n<\/code><\/pre>\n<\/div>\n
<p>=> RMSE: 7.649262236295703<\/p>\n
<p>Here we import the <strong>LinearRegression<\/strong> model from the scikit-learn (sklearn) library, use <strong>duration<\/strong> as the target variable, fit the model on the feature matrix, and then predict on the same training data.
<\/p>\n<p>The root mean squared error (RMSE) is computed from the differences between the actual and predicted values of the target variable; the lower the value, the better.<\/p>\n
<p><strong>Q6: Evaluating the model<\/strong><\/p>\n
<p>Here we apply everything we have done so far to the <em>February<\/em> dataset, which serves as the validation set, simply by wrapping the steps in a function:<\/p>\n
<div class=\"highlight js-code-highlight\">\n<pre class=\"highlight plaintext\"><code>## Compile chunks into a function\ndef rmse_validation(df_pth: str) -> float:\n    val_df = pd.read_parquet(df_pth)\n    val_df[[\"tpep_pickup_datetime\", \"tpep_dropoff_datetime\"]] = val_df[[\"tpep_pickup_datetime\", \"tpep_dropoff_datetime\"]].apply(pd.to_datetime)\n    val_df[\"duration\"] = (val_df[\"tpep_dropoff_datetime\"] - val_df[\"tpep_pickup_datetime\"]).dt.total_seconds()\/60\n    # .copy() avoids pandas' SettingWithCopyWarning on the assignment below\n    val_df = val_df[val_df['duration'].between(1,60)].copy()\n\n    val_df[['PULocationID', 'DOLocationID']] = val_df[['PULocationID', 'DOLocationID']].astype(str)\n    dicts_val = val_df[['PULocationID', 'DOLocationID']].to_dict(orient=\"records\")\n\n    # transform (not fit_transform): reuse the vectorizer fitted on January\n    feature_matrix_val = vec.transform(dicts_val)\n    #print(f\"Dimension of feature_matrix: {feature_matrix_val.shape} \\n\")\n\n    y_val = val_df['duration']\n    y_pred = model.predict(feature_matrix_val)\n    rmse = np.sqrt(mean_squared_error(y_val, y_pred))\n\n    return rmse\n\nresult_feb_df = rmse_validation(\".\/datasets\/feb_yellow.parquet\")\nprint(f\"6, Validation_RMSE: {result_feb_df}\")\n<\/code><\/pre>\n<\/div>\n
<p>=> 6, Validation_RMSE: 7.811812822882009<\/p>\n
<p>The only difference here is the distinction between <strong>fit_transform<\/strong> and <strong>transform<\/strong> as applied to the vectorizer: on the validation set we use <code>transform<\/code>, so that it simply reuses the mapping already fitted on the training set. Any location ID that did not appear in the January data is ignored by <code>transform<\/code>, which keeps the validation matrix at the same 515 columns the model was trained on.<\/p>\n
<p>That's it!<br \/>See wk1_submission to review the code, and cheers!<br \/>If you run into any problems, leave a comment below.<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Recently joined the DataTalks 2024 cohort to earn the MLOps certification, which is essentially built around machine learning pipeline competencies. To complete the course there are assignments that must be done every other week. This will be a series on how the author tackles &hellip;<\/p>\n","protected":false},"author":2,"featured_media":67079,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[339],"tags":[],"class_list":["post-67078","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dev"],"_links":{"self":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts\/67078","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/comments?post=67078"}],"version-history":[{"count":0,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts\/67078\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/media\/67079"}],"wp:attachment":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/media?parent=67078"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/categories?post=67078"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/tags?post=67078"}],"curies":[{"name":"wp","href
":"https:\/\/api.w.org\/{rel}","templated":true}]}}