marcodaniels / elm-robots-humans / Robots

https://moz.com/learn/seo/robotstxt

Include a robots.txt file to instruct robots (search engine crawlers) which pages of your website to crawl and how.


type Value
    = SingleValue String
    | MultiValue (List String)

The value type used for most robots.txt entries
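For illustration, both constructors in use (the anyAgent and privatePaths names and the paths are only examples):

anyAgent : Value
anyAgent =
    SingleValue "*"

privatePaths : Value
privatePaths =
    MultiValue [ "/admin", "/private" ]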


type alias Policy =
    { userAgent : Value
    , allow : Maybe Value
    , disallow : Maybe Value
    }

Policy type for robots.txt policies
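For illustration, a policy that blocks a couple of paths for all agents could look like this (the blockPrivate name and the paths are only examples):

blockPrivate : Policy
blockPrivate =
    { userAgent = SingleValue "*"
    , allow = Nothing
    , disallow = Just (MultiValue [ "/admin", "/tmp" ])
    }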


type alias Robots =
    { policies : List (PolicyExtra Policy)
    , host : String
    , sitemap : Value
    }

Robots.txt input type

policy : Policy -> PolicyExtra Policy

Create a robots.txt policy entry

policy
    { userAgent = SingleValue "*"
    , allow = Just (SingleValue "*")
    , disallow = Nothing
    }

robots : Robots -> String

Create a String with the robots.txt output

robots
    { sitemap = SingleValue "/sitemap.xml"
    , host = "https://marcodaniels.com"
    , policies =
        [ policy
            { userAgent = SingleValue "*"
            , allow = Just (SingleValue "*")
            , disallow = Nothing
            }
        ]
    }
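The resulting String follows the standard robots.txt directives; the exact line order and formatting below are an assumption, not verified against this package:

-- Assumed output shape (unverified):
--
-- User-agent: *
-- Allow: *
-- Host: https://marcodaniels.com
-- Sitemap: /sitemap.xml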

withCrawlDelay : Basics.Int -> PolicyExtra Policy -> PolicyExtra Policy

Add the crawl-delay property to the policy entry

policy
    { userAgent = SingleValue "*"
    , allow = Just (SingleValue "*")
    , disallow = Nothing
    }
    |> withCrawlDelay 10
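Since withCrawlDelay returns a PolicyExtra Policy, the result can be placed directly in the policies list; a sketch reusing the values from the robots example above:

robots
    { sitemap = SingleValue "/sitemap.xml"
    , host = "https://marcodaniels.com"
    , policies =
        [ policy
            { userAgent = SingleValue "*"
            , allow = Just (SingleValue "*")
            , disallow = Nothing
            }
            |> withCrawlDelay 10
        ]
    }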


type alias CleanParam =
    { param : String
    , path : String
    }

Clean-param type for withCleanParam

withCleanParam : List CleanParam -> PolicyExtra Policy -> PolicyExtra Policy

Add the clean-param property to the policy entry

policy
    { userAgent = SingleValue "*"
    , allow = Just (SingleValue "*")
    , disallow = Nothing
    }
    |> withCleanParam [ { param = "id", path = "/user" } ]
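The extras compose, so a single policy entry can carry both a crawl delay and clean-param rules; a sketch chaining the two examples above:

policy
    { userAgent = SingleValue "*"
    , allow = Just (SingleValue "*")
    , disallow = Nothing
    }
    |> withCrawlDelay 10
    |> withCleanParam [ { param = "id", path = "/user" } ]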