{"id":79228,"date":"2024-10-05T23:54:47","date_gmt":"2024-10-05T20:24:47","guid":{"rendered":"https:\/\/nabfollower.com\/blog\/activation-functions-in-pytorch-4-a6i\/"},"modified":"2024-10-05T23:54:47","modified_gmt":"2024-10-05T20:24:47","slug":"activation-functions-in-pytorch-4-a6i","status":"publish","type":"post","link":"https:\/\/nabfollower.com\/blog\/activation-functions-in-pytorch-4-a6i\/","title":{"rendered":"Activation functions in PyTorch (4)"},"content":{"rendered":"<div data-article-id=\"2027070\" id=\"article-body\">\n<p>*Notes:<\/p>\n<p>(1) GELU (Gaussian Error Linear Unit):<\/p>\n<ul>\n<li>Converts an input value (<code>x<\/code>) to an output value by weighting it with the probability of that value under a Gaussian distribution, optionally approximated with Tanh. *The output is 0 only when <code>x = 0<\/code>.<\/li>\n<li>Its formula can be written in two ways. 
*Both forms give almost the same results:<br \/>\n<br \/>\nor:<br \/>\n<img decoding=\"async\" src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F78eshvn0gj3lwgla1w5t.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" height=\"165\" title=\"\">\n<\/li>\n<li>It is GELU() in PyTorch.<\/li>\n<li>It is used in:\n<ul>\n<li>Transformer. *Transformer() in PyTorch.<\/li>\n<li>Transformer-based NLP (Natural Language Processing) models such as ChatGPT, BERT (Bidirectional Encoder Representations from Transformers), etc.<\/li>\n<\/ul>\n<\/li>\n<li>Pros:\n<ul>\n<li>It mitigates the <strong>vanishing gradient problem<\/strong>.<\/li>\n<li>It mitigates the <strong>dying ReLU problem<\/strong>. 
*0 is still produced for the input value 0, so the <strong>dying ReLU problem<\/strong> is not completely avoided.<\/li>\n<\/ul>\n<\/li>\n<li>Cons:\n<ul>\n<li>It is computationally expensive because of its complex operations, including Erf (error function) or Tanh.<\/li>\n<\/ul>\n<\/li>\n<li>Graph in Desmos:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2r918i0u431ieh25qugx.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" height=\"441\" title=\"\"><\/p>\n<p>(2) Mish:<\/p>\n<ul>\n<li>Converts an input value (<code>x<\/code>) to an output value by <code>x * Tanh(Softplus(x))<\/code>. 
*The output is 0 only when <code>x = 0<\/code>.<\/li>\n<li>Its formula is:<br \/>\n<img decoding=\"async\" src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxitk343djh0hto40khg.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" height=\"278\" title=\"\">\n<\/li>\n<li>It is Mish() in PyTorch.<\/li>\n<li>Pros:\n<ul>\n<li>It mitigates the <strong>vanishing gradient problem<\/strong>.<\/li>\n<li>It mitigates the <strong>dying ReLU problem<\/strong>. 
*0 is still produced for the input value 0, so the <strong>dying ReLU problem<\/strong> is not completely avoided.<\/li>\n<\/ul>\n<\/li>\n<li>Cons:\n<ul>\n<li>It is computationally expensive because of the Tanh and Softplus operations.<\/li>\n<\/ul>\n<\/li>\n<li>Graph in Desmos:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zb350r3ni4nvc22evix.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" height=\"441\" title=\"\"><\/p>\n<p>(3) SiLU (Sigmoid-Weighted Linear Units):<\/p>\n<ul>\n<li>Converts an input value (<code>x<\/code>) to an output value by <code>x * Sigmoid(x)<\/code>. 
*The output is 0 only when <code>x = 0<\/code>.<\/li>\n<li>Its formula is y = <code>x<\/code> \/ (1 + e<sup>-<code>x<\/code><\/sup>).<\/li>\n<li>It is also called Swish.<\/li>\n<li>It is SiLU() in PyTorch.<\/li>\n<li>Pros:\n<ul>\n<li>It mitigates the <strong>vanishing gradient problem<\/strong>.<\/li>\n<li>It mitigates the <strong>dying ReLU problem<\/strong>. *0 is still produced for the input value 0, so the <strong>dying ReLU problem<\/strong> is not completely avoided.<\/li>\n<\/ul>\n<\/li>\n<li>Cons:\n<ul>\n<li>It is computationally expensive because of Sigmoid.<\/li>\n<\/ul>\n<\/li>\n<li>Graph in Desmos:<\/li>\n<\/ul>\n<p><img decoding=\"async\" src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F050eawd4od6f8ht23c29.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" 
height=\"444\" title=\"\"><\/p>\n<p>(4) Softplus:<\/p>\n<ul>\n<li>Converts an input value (<code>x<\/code>) to an output value between 0 and \u221e. *0 is exclusive.<\/li>\n<li>Its formula is y = log(1 + e<sup>x<\/sup>).<\/li>\n<li>It is Softplus() in PyTorch.<\/li>\n<li>Pros:\n<ul>\n<li>It normalizes input values.<\/li>\n<li>Its convergence is stable.<\/li>\n<li>It mitigates the <strong>vanishing gradient problem<\/strong>.<\/li>\n<li>It mitigates the <strong>exploding gradient problem<\/strong>.<\/li>\n<li>It avoids the <strong>dying ReLU problem<\/strong>.<\/li>\n<\/ul>\n<\/li>\n<li>Cons:\n<ul>\n<li>It is computationally expensive because of the log and exponential operations.<\/li>\n<\/ul>\n<\/li>\n<li>Graph in Desmos:<\/li>\n<\/ul>\n<p><img decoding=\"async\" 
src=\"https:\/\/media.dev.to\/dynamic\/image\/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto\/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy6oolomdjnldtctteh17.png\" alt=\"Image description\" loading=\"lazy\" width=\"800\" height=\"434\" title=\"\"><\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>*Notes: (1) GELU (Gaussian Error Linear Unit): Converts an input value (x) to an output value by weighting it with the probability of that value under a Gaussian distribution, optionally approximated with Tanh. *The output is 0 only when x = 0. Its formula can be written in two ways. 
*Both forms give almost the same results &hellip;<\/p>\n","protected":false},"author":2,"featured_media":79229,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"","fifu_image_alt":"","footnotes":""},"categories":[339],"tags":[],"class_list":["post-79228","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dev"],"_links":{"self":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts\/79228","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/comments?post=79228"}],"version-history":[{"count":0,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/posts\/79228\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/media\/79229"}],"wp:attachment":[{"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/media?parent=79228"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/categories?post=79228"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nabfollower.com\/blog\/wp-json\/wp\/v2\/tags?post=79228"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}