diff --git a/CODESMELL.md b/CODESMELL.md
index 8f8032c..13152c9 100644
--- a/CODESMELL.md
+++ b/CODESMELL.md
@@ -262,11 +262,11 @@ This suggests data isn't normalized at boundaries. Prefer atoms for internal str
 2. **Extract filesystem / Search side effects out of `Repo.transaction` in `BDS.Media`.** ✅ **DONE 2026-04-30.** See "Priority #2 Completion" section below.
 3. **Fix `MCP.atomize_keys`** to use `String.to_existing_atom/1` with a string-fallback. ✅ **DONE 2026-04-30.** See "Priority #3 Completion" section below.
 4. **Introduce `BDS.PostMedia` Ecto schema** and migrate the 6–8 raw `post_media` queries. ✅ **DONE 2026-04-30.** See "Priority #4 Completion" section below.
-5. **Replace `Repo.get` calls in `ShellLive`** with context functions (add new context functions where needed).
-6. **Move locale from `Process.put` into assigns**, then ban `Process.put` via Credo.
-7. **Extract shared helpers** (`attr/2`, `maybe_put/3`, `blank_to_nil/1`, `progress_callback/1`, rebuild progress reporters) into `BDS.MapUtils` / `BDS.ProgressReporter`.
-8. **Wrap external `Jason.decode!` calls** in `BDS.AI.OpenAICompatibleRuntime` and `BDS.AI` with `Jason.decode/1` + `{:error, _}` propagation.
-9. **Module split.** `BDS.Generation` (2624) and `BDS.Desktop.ShellLive` (2607) first; schedule each as its own sprint.
+5. **Module split.** `BDS.Generation` (2624) and `BDS.Desktop.ShellLive` (2607) first, then `BDS.AI` (1700+) and `BDS.Posts`. ✅ **PARTIAL 2026-04-30.** `BDS.Generation` reduced 2651 → 1873 (29%). See "Priority #5 Progress" section below.
+6. **Replace `Repo.get` calls in `ShellLive`** with context functions (add new context functions where needed).
+7. **Move locale from `Process.put` into assigns**, then ban `Process.put` via Credo.
+8. **Extract shared helpers** (`attr/2`, `maybe_put/3`, `blank_to_nil/1`, `progress_callback/1`, rebuild progress reporters) into `BDS.MapUtils` / `BDS.ProgressReporter`.
+9. **Wrap external `Jason.decode!` calls** in `BDS.AI.OpenAICompatibleRuntime` and `BDS.AI` with `Jason.decode/1` + `{:error, _}` propagation.
 
 ### Skipped / downgraded
 
@@ -407,6 +407,36 @@ Introduced [lib/bds/posts/post_media.ex](lib/bds/posts/post_media.ex) — a prop
 
 ---
 
+## Priority #5 Progress (2026-04-30)
+
+**Goal:** Split god modules. Started with the worst offender, `BDS.Generation` (2651 lines).
+
+**Result:** `lib/bds/generation.ex` reduced **2651 → 1873 lines (29%)** by extracting six cohesive submodules under `lib/bds/generation/`:
+
+| Module | Lines | Responsibility |
+|---|---|---|
+| `BDS.Generation.Paths` | 262 | URL/route/path helpers, language prefixing, pagination math, archive routing |
+| `BDS.Generation.Sitemap` | 280 | sitemap.xml, RSS/Atom feeds, calendar feed, hreflang link assembly |
+| `BDS.Generation.Renderers` | 227 | Liquid template rendering wrappers (home, post, archive, date, list, 404) |
+| `BDS.Generation.Progress` | 96 | Generation/validation progress callback helpers |
+| `BDS.Generation.Pagefind` | 70 | Pagefind search-index input file emission |
+| `BDS.Generation.GeneratedFileHash` | 23 | (pre-existing) hash-tracking schema |
+
+Total: 958 lines now live in focused submodules; the remaining 1873 in `BDS.Generation` is mostly the validation engine, output builders, and snapshot/data assembly — candidates for the next iteration.
+
+**Refactor pattern used:** `import BDS.Generation.X, only: [...]` (or `except: [...]`) at the head of `BDS.Generation` so the hundreds of internal call sites needed no changes; `defdelegate` for any function that had to remain reachable through the public `BDS.Generation` namespace (e.g. `post_output_path/1,2`).
+
+**Validation after each extraction:** `mix compile --warnings-as-errors` clean, `mix dialyzer --format short` 0 errors, `mix test` 342/0/4.
+
+**Remaining work in this priority** (in suggested order of decreasing isolation):
+
+1. `BDS.Generation.Outputs` — extract the `build_*_outputs/*` family and `build_validation_route_paths` (~600 lines).
+2. `BDS.Generation.Data` — extract `generation_data/2`, snapshot loaders, post-index builders (~300 lines).
+3. `BDS.Generation.Validation` — extract `compare_sitemap_to_html`, `classify_validation_path`, `build_targeted_validation_plan`, `delete_extra_validation_paths`, `write_ancillary_validation_outputs` (~600 lines). Most coupled — do last.
+4. After `BDS.Generation`, repeat the pattern on `BDS.Desktop.ShellLive` (2607), `BDS.Posts` (1781), `BDS.AI` (1711), `BDS.MCP` (677).
+
+---
+
 ## Bottom Line
 
 The biggest risks are **module size** and **duplicated helpers**, followed by the **process dictionary i18n** and **side effects in transactions**. Fixing the top 5 anti-patterns would significantly improve maintainability, testability, and reliability of the desktop app over long-running sessions.
diff --git a/lib/bds/generation.ex b/lib/bds/generation.ex
index c3117f1..b944068 100644
--- a/lib/bds/generation.ex
+++ b/lib/bds/generation.ex
@@ -2,17 +2,31 @@ defmodule BDS.Generation do
   @moduledoc false
 
   import Ecto.Query
+  import BDS.Generation.Paths,
+    except: [post_output_path: 1, post_output_path: 2]
+  import BDS.Generation.Sitemap,
+    only: [
+      render: 1,
+      render_multi_language: 6,
+      render_feed: 3,
+      render_atom: 3,
+      render_calendar: 1,
+      extract_locs: 1,
+      loc_to_project_path: 2
+    ]
+  import BDS.Generation.Renderers
+  import BDS.Generation.Progress
 
   alias BDS.DocumentFields
   alias BDS.Frontmatter
   alias BDS.Generation.GeneratedFileHash
+  alias BDS.Generation.Paths
   alias BDS.Metadata
   alias BDS.Persistence
   alias BDS.PreviewAssets
   alias BDS.Posts.Post
   alias BDS.Posts.Translation
   alias BDS.Projects
-  alias BDS.Rendering
   alias BDS.Repo
   alias BDS.Slug
 
@@ -61,7 +75,7 @@ defmodule BDS.Generation do when is_binary(project_id) and is_list(sections) and is_list(opts) do with {:ok, plan} <- plan_generation(project_id, sections) do outputs = build_outputs(plan) - 
on_progress = progress_callback(opts) + on_progress = callback(opts) total_outputs = length(outputs) :ok = report_generation_started(on_progress, total_outputs, "generated files") @@ -84,7 +98,7 @@ defmodule BDS.Generation do def validate_site(project_id, sections, opts) when is_binary(project_id) and is_list(sections) and is_list(opts) do with {:ok, plan} <- plan_generation(project_id, sections) do - on_progress = progress_callback(opts) + on_progress = callback(opts) :ok = report_validation_progress(on_progress, 0.0, "Collecting sitemap URLs...") data = @@ -145,67 +159,6 @@ defmodule BDS.Generation do end end - defp progress_callback(opts) do - case Keyword.get(opts, :on_progress) do - callback when is_function(callback, 2) -> callback - _other -> nil - end - end - - defp report_generation_started(nil, _total, _label), do: :ok - - defp report_generation_started(callback, 0, label) do - callback.(1.0, "No #{label} to process") - :ok - end - - defp report_generation_started(callback, total, label) do - callback.(0.0, "Processing 0/#{total} #{label}") - :ok - end - - defp report_generation_progress(nil, _current, _total, _label), do: :ok - defp report_generation_progress(_callback, _current, 0, _label), do: :ok - - defp report_generation_progress(callback, current, total, label) do - callback.(current / total, "Processing #{current}/#{total} #{label}") - :ok - end - - defp report_validation_progress(nil, _progress, _message), do: :ok - - defp report_validation_progress(callback, progress, message) do - callback.(progress, message) - :ok - end - - defp report_validation_snapshot_progress(nil, _stage, _current, _total), do: :ok - - defp report_validation_snapshot_progress(_callback, _stage, _current, total) - when total <= 0, - do: :ok - - defp report_validation_snapshot_progress(callback, :posts, current, total) do - progress = min(0.18, current / total * 0.18) - callback.(progress, "Collecting sitemap URLs... 
#{current}/#{total}") - :ok - end - - defp report_validation_snapshot_progress(callback, :translations, current, total) do - progress = 0.18 + min(0.12, current / total * 0.12) - callback.(progress, "Collecting sitemap URLs... #{current}/#{total}") - :ok - end - - defp report_validation_collection_progress(nil, _current, _total), do: :ok - defp report_validation_collection_progress(_callback, _current, total) when total <= 0, do: :ok - - defp report_validation_collection_progress(callback, current, total) do - progress = min(0.49, 0.30 + current / total * 0.19) - callback.(progress, "Collecting sitemap URLs... #{current}/#{total}") - :ok - end - @spec apply_validation(String.t(), [section()] | map()) :: {:ok, map()} | {:error, term()} def apply_validation(project_id, sections) when is_binary(project_id) and is_list(sections) do with {:ok, plan} <- plan_generation(project_id, sections) do @@ -302,23 +255,10 @@ defmodule BDS.Generation do end @spec post_output_path(map()) :: String.t() - def post_output_path(post), do: post_output_path(post, nil) + defdelegate post_output_path(post), to: Paths @spec post_output_path(map(), String.t() | nil) :: String.t() - def post_output_path(post, language) when is_map(post) do - {year, month, day} = local_date_parts!(post.created_at) - year = Integer.to_string(year) - month = month |> Integer.to_string() |> String.pad_leading(2, "0") - day = day |> Integer.to_string() |> String.pad_leading(2, "0") - - path_parts = [year, month, day, post.slug, "index.html"] - - case language do - nil -> Path.join(path_parts) - "" -> Path.join(path_parts) - value -> Path.join([value | path_parts]) - end - end + defdelegate post_output_path(post, language), to: Paths @typedoc "Result returned by `write_generated_file/3,4`." 
@type write_result :: %{relative_path: String.t(), content_hash: String.t(), written?: boolean()} @@ -764,14 +704,14 @@ defmodule BDS.Generation do sitemap = if :core in plan.sections do - [{"sitemap.xml", render_sitemap(urls)}] + [{"sitemap.xml", render(urls)}] else [] end pagefind_outputs = if :core in plan.sections do - build_pagefind_outputs(plan, core_outputs ++ page_outputs ++ single_outputs ++ archive_outputs) + BDS.Generation.Pagefind.build_outputs(plan, core_outputs ++ page_outputs ++ single_outputs ++ archive_outputs) else [] end @@ -827,7 +767,7 @@ defmodule BDS.Generation do sitemap_content = main_paths |> Enum.map(&url_for_output(plan.base_url, &1)) - |> render_sitemap() + |> render() additional_expected_paths = additional_language_sets @@ -850,7 +790,7 @@ defmodule BDS.Generation do [] -> sitemap_content languages -> - render_multi_language_sitemap( + render_multi_language( plan, Enum.reject(data.published_posts, &truthy_flag?(Map.get(&1, :do_not_translate))), Enum.filter(data.published_posts, &truthy_flag?(Map.get(&1, :do_not_translate))), @@ -987,8 +927,6 @@ defmodule BDS.Generation do end) end - defp truthy_flag?(value), do: value not in [false, nil] - defp disk_generated_files(project_id) do project = Projects.get_project!(project_id) html_root = output_path(project, "") @@ -1321,91 +1259,6 @@ defmodule BDS.Generation do end) end - defp paginated_archive_paths(route_language, segments, total_items, max_posts_per_page) do - total_pages = page_count(total_items, max_posts_per_page) - - Enum.map(1..total_pages, fn page_number -> - archive_path(route_language, segments, page_number) - end) - end - - defp root_route_paths(route_language, total_items, max_posts_per_page) do - total_pages = page_count(total_items, max_posts_per_page) - - Enum.map(1..total_pages, fn page_number -> - root_output_path(route_language, page_number) - end) - end - - defp root_output_path(nil, 1), do: "index.html" - defp root_output_path("", 1), do: "index.html" - defp 
root_output_path(route_language, 1), do: Path.join(route_language, "index.html") - defp root_output_path(nil, page_number), do: Path.join(["page", Integer.to_string(page_number), "index.html"]) - defp root_output_path("", page_number), do: root_output_path(nil, page_number) - defp root_output_path(route_language, page_number), do: Path.join([route_language, "page", Integer.to_string(page_number), "index.html"]) - - defp page_output_path(slug, nil), do: Path.join([slug, "index.html"]) - defp page_output_path(slug, ""), do: page_output_path(slug, nil) - defp page_output_path(slug, language), do: Path.join([language, slug, "index.html"]) - - defp pagination_for_page(page_number, total_pages, total_items, items_per_page, route_language, segments) do - %{ - current_page: page_number, - total_pages: total_pages, - total_items: total_items, - items_per_page: items_per_page, - has_prev_page: page_number > 1, - prev_page_href: archive_or_root_href(route_language, segments, page_number - 1), - has_next_page: page_number < total_pages, - next_page_href: archive_or_root_href(route_language, segments, page_number + 1) - } - end - - defp archive_or_root_href(_route_language, _segments, page_number) when page_number < 1, do: "" - defp archive_or_root_href(route_language, [], page_number), do: root_page_href(route_language, page_number) - defp archive_or_root_href(route_language, segments, page_number), do: archive_href(route_language, segments, page_number) - - defp root_page_href(route_language, page_number) when page_number <= 1 do - case route_language do - nil -> "/" - "" -> "/" - language -> "/#{language}/" - end - end - - defp root_page_href(route_language, page_number) do - base = - case route_language do - nil -> "" - "" -> "" - language -> "/#{language}" - end - - "#{base}/page/#{page_number}/" - end - - defp page_count(total_items, _max_posts_per_page) when total_items <= 0, do: 1 - - defp page_count(total_items, max_posts_per_page) do - page_size = 
max(max_posts_per_page, 1) - div(total_items + page_size - 1, page_size) - end - - defp paginate_posts(posts, max_posts_per_page) do - case Enum.chunk_every(posts, max(max_posts_per_page, 1)) do - [] -> [[]] - chunks -> chunks - end - end - - defp report_snapshot_stage_progress(nil, _stage, _current, _total), do: :ok - defp report_snapshot_stage_progress(_callback, _stage, _current, total) when total <= 0, do: :ok - - defp report_snapshot_stage_progress(callback, stage, current, total) do - callback.(stage, current, total) - :ok - end - defp build_single_outputs( project_id, main_language, @@ -1480,35 +1333,6 @@ defmodule BDS.Generation do end end - defp archive_path(language, segments, 1), do: archive_path(language, segments) - - defp archive_path(language, segments, page_number) do - archive_path(language, segments ++ ["page", Integer.to_string(page_number)]) - end - - defp archive_path(nil, segments), do: Path.join(segments ++ ["index.html"]) - defp archive_path("", segments), do: Path.join(segments ++ ["index.html"]) - - defp archive_path(language, segments) do - prefix = if language in [nil, ""], do: [], else: [language] - Path.join(prefix ++ segments ++ ["index.html"]) - end - - defp archive_route_segment(nil), do: "" - defp archive_route_segment(value), do: value |> to_string() |> URI.encode(&URI.char_unreserved?/1) - - defp normalize_base_url(nil), do: nil - defp normalize_base_url(url), do: String.trim_trailing(url, "/") - - defp normalize_blog_languages(main_language, blog_languages) do - ([main_language] ++ (blog_languages || [])) - |> Enum.reject(&(&1 in [nil, ""])) - |> Enum.uniq() - end - - defp route_language(main_language, language) when main_language == language, do: nil - defp route_language(_main_language, language), do: language - defp translation_lookup_map(published_translations) do Map.new(published_translations, fn translation -> {{translation.translation_for, translation.language}, translation} @@ -1553,519 +1377,6 @@ defmodule 
BDS.Generation do } end - defp render_home(plan, language) do - [ - "", - "", - plan.project_name, - "", - "

", - plan.project_name, - "

", - "" - ] - |> IO.iodata_to_binary() - end - - defp render_feed(plan, language, published_posts) do - items = - published_posts - |> Enum.filter(&(&1.language == language or language == plan.language)) - |> Enum.map(fn post -> - "#{xml_escape(post.title)}#{url_for_output(plan.base_url, post_output_path(post))}" - end) - |> Enum.join() - - "#{xml_escape(plan.project_name)} (#{xml_escape(language || "default")})#{items}" - end - - defp render_atom(plan, language, published_posts) do - entries = - published_posts - |> Enum.filter(&(&1.language == language or language == plan.language)) - |> Enum.map(fn post -> - "#{xml_escape(post.title)}#{url_for_output(plan.base_url, post_output_path(post))}" - end) - |> Enum.join() - - "#{xml_escape(plan.project_name)} (#{xml_escape(language || "default")})#{entries}" - end - - defp render_calendar(published_posts) do - published_posts - |> Enum.map(fn post -> - %{date: local_date_iso8601!(post.created_at), slug: post.slug, title: post.title} - end) - |> Jason.encode!() - end - - defp render_sitemap(urls) do - entries = Enum.map_join(urls, "", fn url -> "#{xml_escape(url)}" end) - "#{entries}" - end - - defp render_multi_language_sitemap( - plan, - translatable_posts, - do_not_translate_posts, - published_list_posts, - post_index, - additional_languages - ) do - all_languages = [plan.language | additional_languages] - latest_post_updated_at = latest_post_updated_at_iso(published_list_posts) - - urls = - [ - render_multi_language_sitemap_url( - url_for_path(plan.base_url, "/"), - latest_post_updated_at, - "daily", - "1.0", - build_hreflang_links(plan.base_url, "/", plan.language, all_languages) - ) - ] ++ - Enum.map(root_pagination_pages(length(published_list_posts), plan.max_posts_per_page), fn page_number -> - page_path = "/page/#{page_number}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, page_path), - latest_post_updated_at, - "daily", - "0.9", - build_hreflang_links(plan.base_url, page_path, 
plan.language, all_languages) - ) - end) ++ - Enum.map(translatable_posts, fn post -> - post_path = relative_path_to_url_path(post_output_path(post)) - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, post_path), - unix_ms_to_iso8601(post.updated_at), - "monthly", - "0.8", - build_hreflang_links(plan.base_url, post_path, plan.language, all_languages) - ) - end) ++ - Enum.map(do_not_translate_posts, fn post -> - post_path = relative_path_to_url_path(post_output_path(post)) - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, post_path), - unix_ms_to_iso8601(post.updated_at), - "monthly", - "0.8", - build_hreflang_links(plan.base_url, post_path, plan.language, [plan.language]) - ) - end) ++ - Enum.flat_map(translatable_posts ++ do_not_translate_posts, fn post -> - if "page" in (post.categories || []) and to_string(post.slug) != "" do - page_path = relative_path_to_url_path(page_output_path(post.slug, nil)) - languages = if truthy_flag?(Map.get(post, :do_not_translate)), do: [plan.language], else: all_languages - - [ - render_multi_language_sitemap_url( - url_for_path(plan.base_url, page_path), - unix_ms_to_iso8601(post.updated_at), - "weekly", - "0.7", - build_hreflang_links(plan.base_url, page_path, plan.language, languages) - ) - ] - else - [] - end - end) ++ - Enum.map(Enum.sort_by(post_index.posts_by_year, &elem(&1, 0), :desc), fn {year, _posts} -> - year_path = "/#{year}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, year_path), - latest_post_updated_at, - "monthly", - "0.5", - build_hreflang_links(plan.base_url, year_path, plan.language, all_languages) - ) - end) ++ - Enum.map(Enum.sort_by(post_index.posts_by_year_month, &elem(&1, 0), :desc), fn {year_month, _posts} -> - month_path = "/#{year_month}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, month_path), - latest_post_updated_at, - "monthly", - "0.5", - build_hreflang_links(plan.base_url, month_path, plan.language, 
all_languages) - ) - end) ++ - Enum.map(Enum.sort_by(post_index.posts_by_year_month_day, &elem(&1, 0), :desc), fn {year_month_day, _posts} -> - day_path = "/#{year_month_day}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, day_path), - latest_post_updated_at, - "monthly", - "0.4", - build_hreflang_links(plan.base_url, day_path, plan.language, all_languages) - ) - end) ++ - Enum.map(Enum.sort_by(post_index.posts_by_category, &elem(&1, 0)), fn {category, _posts} -> - category_path = "/category/#{archive_route_segment(category)}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, category_path), - latest_post_updated_at, - "weekly", - "0.6", - build_hreflang_links(plan.base_url, category_path, plan.language, all_languages) - ) - end) ++ - Enum.map(Enum.sort_by(post_index.posts_by_tag, &elem(&1, 0)), fn {tag, _posts} -> - tag_path = "/tag/#{archive_route_segment(tag)}" - - render_multi_language_sitemap_url( - url_for_path(plan.base_url, tag_path), - latest_post_updated_at, - "weekly", - "0.6", - build_hreflang_links(plan.base_url, tag_path, plan.language, all_languages) - ) - end) - - [ - "", - "", - Enum.join(urls, "\n"), - "", - "" - ] - |> Enum.join("\n") - end - - defp latest_post_updated_at_iso([]), do: DateTime.utc_now() |> DateTime.to_iso8601() - defp latest_post_updated_at_iso([post | _rest]), do: unix_ms_to_iso8601(post.updated_at) - - defp root_pagination_pages(total_items, max_posts_per_page) do - case page_count(total_items, max_posts_per_page) do - total_pages when total_pages > 1 -> Enum.to_list(2..total_pages) - _other -> [] - end - end - - defp unix_ms_to_iso8601(nil), do: DateTime.utc_now() |> DateTime.to_iso8601() - defp unix_ms_to_iso8601(value), do: value |> Persistence.from_unix_ms!() |> DateTime.to_iso8601() - - defp url_for_path(nil, path), do: ensure_trailing_slash(path) - - defp url_for_path(base_url, path) do - String.trim_trailing(base_url, "/") <> ensure_trailing_slash(path) - end - - defp 
ensure_trailing_slash(path) do - normalized_path = normalize_url_path(path) - if normalized_path == "/", do: "/", else: normalized_path <> "/" - end - - defp build_hreflang_links(base_url, url_path, main_language, languages) do - Enum.map(languages, fn language -> - prefixed_path = - if language == main_language do - url_path - else - normalize_url_path("/#{language}#{url_path}") - end - - canonical_href = url_for_path(base_url, prefixed_path) - - " " - end) ++ - [ - " " - ] - end - - defp render_multi_language_sitemap_url(loc, lastmod, changefreq, priority, hreflang_links) do - [ - " ", - " #{xml_escape(loc)}", - " #{xml_escape(lastmod)}", - " #{changefreq}", - " #{priority}", - Enum.join(hreflang_links, "\n"), - " " - ] - |> Enum.join("\n") - end - - defp sitemap_route_output?("404.html"), do: false - defp sitemap_route_output?("feed.xml"), do: false - defp sitemap_route_output?("atom.xml"), do: false - defp sitemap_route_output?("calendar.json"), do: false - defp sitemap_route_output?(relative_path), do: String.ends_with?(relative_path, ".html") - - defp build_pagefind_outputs(plan, html_outputs) do - language_outputs = - plan.blog_languages - |> Enum.uniq() - |> Enum.flat_map(fn language -> - route_language = route_language(plan.language, language) - pages = pagefind_pages_for_language(html_outputs, route_language) - prefix = if route_language in [nil, ""], do: ["pagefind"], else: [route_language, "pagefind"] - - [ - {Path.join(prefix ++ ["index.json"]), Jason.encode!(%{"language" => language, "pages" => pages})}, - {Path.join(prefix ++ ["pagefind-ui.js"]), pagefind_ui_js(language)}, - {Path.join(prefix ++ ["pagefind-ui.css"]), pagefind_ui_css()} - ] - end) - - language_outputs - end - - defp pagefind_pages_for_language(html_outputs, route_language) do - html_outputs - |> Enum.filter(fn {relative_path, _content} -> - String.ends_with?(relative_path, ".html") and pagefind_language_match?(relative_path, route_language) - end) - |> Enum.map(fn {relative_path, 
content} -> - %{ - "url" => "/" <> relative_path, - "text" => pagefind_text(content) - } - end) - end - - defp pagefind_language_match?(relative_path, nil), do: not String.starts_with?(relative_path, ["de/", "fr/", "it/", "es/"]) - defp pagefind_language_match?(relative_path, ""), do: pagefind_language_match?(relative_path, nil) - defp pagefind_language_match?(relative_path, route_language), do: String.starts_with?(relative_path, route_language <> "/") - - defp pagefind_text(content) do - content - |> String.replace(~r/<[^>]+>/, " ") - |> String.replace(~r/\s+/u, " ") - |> String.trim() - end - - defp pagefind_ui_js(language) do - "window.bDSPagefind = { language: #{Jason.encode!(language)} };\n" - end - - defp pagefind_ui_css do - ".pagefind-ui{display:block;}\n" - end - - defp render_post_page(title, body, slug, language) do - [ - "", - "", - to_string(title), - "", - "
", - body, - "
", - "" - ] - |> IO.iodata_to_binary() - end - - defp render_archive_page(plan, title, posts, language, kind, pagination) do - fallback = fn -> - items = - posts - |> Enum.map(fn post -> ["
  • ", post.title, "
  • "] end) - |> IO.iodata_to_binary() - - [ - "

    ", - title, - "

    " - ] - |> IO.iodata_to_binary() - end - - render_list_output( - plan, - language, - title, - Enum.map(posts, fn post -> - %{ - id: post.id, - slug: post.slug, - title: post.title, - href: "#", - excerpt: post.excerpt, - content: nil, - language: post.language - } - end), - %{kind: kind, name: title}, - pagination, - fallback - ) - end - - defp render_date_archive_page(plan, label, archive_context, posts, language, pagination) do - fallback = fn -> - items = - posts - |> Enum.map(fn post -> ["
  • ", post.title, "
  • "] end) - |> IO.iodata_to_binary() - - [ - "

    ", - label, - "

    " - ] - |> IO.iodata_to_binary() - end - - render_list_output( - plan, - language, - label, - build_list_posts(plan.base_url, posts, route_language(plan.language, language)), - archive_context, - pagination, - fallback - ) - end - - defp load_body(_project_id, _file_path, inline_content) when is_binary(inline_content), - do: inline_content - - defp load_body(project_id, file_path, _inline_content) do - case file_path do - nil -> - "" - - "" -> - "" - - value -> - project_path = - Path.expand(value, Projects.project_data_dir(Projects.get_project!(project_id))) - - case File.read(project_path) do - {:ok, contents} -> parse_frontmatter_body(contents) - {:error, _reason} -> "" - end - end - end - - defp parse_frontmatter_body(contents) do - case String.split(contents, "\n---\n", parts: 2) do - [_frontmatter, body] -> String.trim_trailing(body, "\n") - _parts -> contents - end - end - - defp build_list_posts(base_url, posts, language_prefix) do - Enum.map(posts, fn post -> - %{ - id: post.id, - slug: post.slug, - title: post.title, - href: url_for_output(base_url, post_output_path(post, language_prefix)), - excerpt: post.excerpt, - content: load_body(post.project_id, post.file_path, post.content) - } - end) - end - - defp render_post_output(project_id, template_slug, assigns, fallback) do - case Rendering.render_post_page(project_id, template_slug, assigns) do - {:ok, rendered} -> rendered - {:error, _reason} -> fallback.() - end - end - - defp render_list_output( - %{project_id: project_id, language: main_language}, - language, - page_title, - posts, - archive_context, - pagination, - fallback - ) - when is_binary(project_id) do - case Rendering.render_list_page(project_id, %{ - language: language, - language_prefix: language_prefix(language, main_language), - page_title: page_title, - posts: posts, - archive_context: archive_context, - pagination: pagination - }) do - {:ok, rendered} -> rendered - {:error, _reason} -> fallback.() - end - end - - defp 
render_not_found_output(%{project_id: project_id, language: main_language}, language) - when is_binary(project_id) do - case Rendering.render_not_found_page(project_id, %{ - language: language, - language_prefix: language_prefix(language, main_language) - }) do - {:ok, rendered} -> rendered - {:error, _reason} -> render_not_found_page(language) - end - end - - defp language_prefix(language, main_language) when language == main_language, do: "" - defp language_prefix(nil, _main_language), do: "" - defp language_prefix(language, _main_language), do: "/#{language}" - - defp archive_href(language, segments, page_number) do - archive_path(language, segments, page_number) - |> String.trim_trailing("index.html") - |> then(&("/" <> String.trim_leading(&1, "/"))) - end - - defp url_for_output(nil, relative_path), do: "/" <> String.trim_leading(relative_path, "/") - - defp url_for_output(base_url, relative_path) do - cleaned = relative_path |> String.trim_leading("/") |> String.trim_trailing("index.html") - suffix = if cleaned == "", do: "/", else: "/" <> cleaned - String.trim_trailing(base_url, "/") <> suffix - end - - defp render_not_found_page(language) do - [ - "

    404

    Not Found

    "
-    ]
-    |> IO.iodata_to_binary()
-  end
-
-  defp xml_escape(value) do
-    value
-    |> to_string()
-    |> String.replace("&", "&amp;")
-    |> String.replace("<", "&lt;")
-    |> String.replace(">", "&gt;")
-    |> String.replace("\"", "&quot;")
-    |> String.replace("'", "&apos;")
-  end
-
   defp upsert_generated_file_hash(project_id, relative_path, content_hash, now) do
     %GeneratedFileHash{}
     |> GeneratedFileHash.changeset(%{
@@ -2134,8 +1445,8 @@ defmodule BDS.Generation do

     expected_path_set =
       params.sitemap_xml
-      |> extract_sitemap_locs()
-      |> Enum.map(&sitemap_loc_to_project_path(&1, params.base_url))
+      |> extract_locs()
+      |> Enum.map(&loc_to_project_path(&1, params.base_url))
       |> Enum.reduce(MapSet.new(), &MapSet.put(&2, normalize_url_path(&1)))
       |> then(fn expected_paths ->
         Enum.reduce(Map.get(params, :additional_expected_paths, []), expected_paths, fn path, acc ->
@@ -2217,34 +1528,6 @@ defmodule BDS.Generation do
     }
   end

-  defp extract_sitemap_locs(sitemap_xml) do
-    Regex.scan(~r/<loc>(.*?)<\/loc>/, sitemap_xml, capture: :all_but_first)
-    |> Enum.map(fn [value] -> String.trim(value) end)
-    |> Enum.reject(&(&1 == ""))
-  end
-
-  defp sitemap_loc_to_project_path(loc, nil), do: normalize_url_path(loc)
-
-  defp sitemap_loc_to_project_path(loc, base_url) do
-    with {:ok, loc_uri} <- URI.new(loc),
-         {:ok, base_uri} <- URI.new(base_url) do
-      loc_path = String.trim_trailing(loc_uri.path || "/", "/")
-      base_path = String.trim_trailing(base_uri.path || "", "/")
-
-      cond do
-        base_path != "" and String.starts_with?(loc_path, base_path) ->
-          loc_path
-          |> String.replace_prefix(base_path, "")
-          |> normalize_url_path()
-
-        true ->
-          normalize_url_path(loc_path)
-      end
-    else
-      _other -> normalize_url_path(loc)
-    end
-  end
-
   defp collect_html_index_paths(index_paths, html_dir, on_progress, total_compare_steps) do
     index_paths
     |> Enum.with_index(1)
@@ -2270,56 +1553,6 @@ defmodule BDS.Generation do
     end)
   end

-  defp report_validation_compare_progress(nil, _current, _total), do: :ok
-  defp report_validation_compare_progress(_callback, _current, total) when total <= 0, do: :ok
-
-  defp report_validation_compare_progress(callback, current, total) do
-    progress = min(0.99, 0.5 + current / total * 0.49)
-    callback.(progress, "Comparing sitemap to html pages... #{current}/#{total}")
-    :ok
-  end
-
-  defp normalize_url_path(nil), do: "/"
-
-  defp normalize_url_path(url_path) do
-    trimmed = String.trim(url_path || "")
-
-    cond do
-      trimmed in ["", "/"] ->
-        "/"
-
-      true ->
-        trimmed
-        |> String.split(["?", "#"])
-        |> List.first()
-        |> to_string()
-        |> String.trim("/")
-        |> case do
-          "" -> "/"
-          value -> "/" <> value
-        end
-    end
-  end
-
-  defp relative_path_to_url_path(relative_path) do
-    relative_path
-    |> String.trim_leading("/")
-    |> String.trim_trailing("index.html")
-    |> String.trim_trailing("/")
-    |> case do
-      "" -> "/"
-      value -> "/" <> value
-    end
-  end
-
-  defp url_path_to_relative_index_path("/"), do: "index.html"
-
-  defp url_path_to_relative_index_path(url_path) do
-    url_path
-    |> normalize_url_path()
-    |> String.trim_leading("/")
-    |> Path.join("index.html")
-  end

   defp mtime_ms(%{mtime: mtime}) when is_integer(mtime) do
     mtime * 1000
@@ -2477,17 +1710,6 @@ defmodule BDS.Generation do
     post.slug == route.slug and year == route.year and month == route.month and day == route.day
   end

-  defp local_date_parts!(value) do
-    normalized = Persistence.normalize_unix_timestamp(value)
-    {{year, month, day}, _time} = :calendar.system_time_to_local_time(normalized, :millisecond)
-    {year, month, day}
-  end
-
-  defp local_date_iso8601!(value) do
-    {year, month, day} = local_date_parts!(value)
-    Date.new!(year, month, day) |> Date.to_iso8601()
-  end
-
   defp route_key(year, month, day, slug) do
     "#{year}/#{String.pad_leading(Integer.to_string(month), 2, "0")}/#{String.pad_leading(Integer.to_string(day), 2, "0")}/#{slug}"
   end
diff --git a/lib/bds/generation/pagefind.ex b/lib/bds/generation/pagefind.ex
new file mode 100644
index 0000000..3a43ca2
--- /dev/null
+++ b/lib/bds/generation/pagefind.ex
@@ -0,0 +1,70 @@
+defmodule BDS.Generation.Pagefind do
+  @moduledoc false
+
+  @typedoc "A (relative_path, content) HTML output tuple."
+  @type html_output :: {String.t(), String.t()}
+
+  @typedoc "A (relative_path, content) generated file tuple."
+  @type generated_file :: {String.t(), String.t()}
+
+  @doc """
+  Build the per-language Pagefind index outputs (`pagefind/index.json`,
+  `pagefind/pagefind-ui.js`, `pagefind/pagefind-ui.css`) for every blog
+  language declared on the plan.
+  """
+  @spec build_outputs(map(), [html_output()]) :: [generated_file()]
+  def build_outputs(plan, html_outputs) do
+    plan.blog_languages
+    |> Enum.uniq()
+    |> Enum.flat_map(fn language ->
+      route_language = route_language(plan.language, language)
+      pages = pages_for_language(html_outputs, route_language)
+      prefix = if route_language in [nil, ""], do: ["pagefind"], else: [route_language, "pagefind"]
+
+      [
+        {Path.join(prefix ++ ["index.json"]), Jason.encode!(%{"language" => language, "pages" => pages})},
+        {Path.join(prefix ++ ["pagefind-ui.js"]), ui_js(language)},
+        {Path.join(prefix ++ ["pagefind-ui.css"]), ui_css()}
+      ]
+    end)
+  end
+
+  defp pages_for_language(html_outputs, route_language) do
+    html_outputs
+    |> Enum.filter(fn {relative_path, _content} ->
+      String.ends_with?(relative_path, ".html") and language_match?(relative_path, route_language)
+    end)
+    |> Enum.map(fn {relative_path, content} ->
+      %{
+        "url" => "/" <> relative_path,
+        "text" => text(content)
+      }
+    end)
+  end
+
+  defp language_match?(relative_path, nil),
+    do: not String.starts_with?(relative_path, ["de/", "fr/", "it/", "es/"])
+
+  defp language_match?(relative_path, ""), do: language_match?(relative_path, nil)
+
+  defp language_match?(relative_path, route_language),
+    do: String.starts_with?(relative_path, route_language <> "/")
+
+  defp text(content) do
+    content
+    |> String.replace(~r/<[^>]+>/, " ")
+    |> String.replace(~r/\s+/u, " ")
+    |> String.trim()
+  end
+
+  defp ui_js(language) do
+    "window.bDSPagefind = { language: #{Jason.encode!(language)} };\n"
+  end
+
+  defp ui_css do
+    ".pagefind-ui{display:block;}\n"
+  end
+
+  defp route_language(main_language, language) when main_language == language, do: nil
+  defp route_language(_main_language, language), do: language
+end
diff --git a/lib/bds/generation/paths.ex b/lib/bds/generation/paths.ex
new file mode 100644
index 0000000..c2b9304
--- /dev/null
+++ b/lib/bds/generation/paths.ex
@@ -0,0 +1,262 @@
+defmodule BDS.Generation.Paths do
+  @moduledoc false
+
+  alias BDS.Persistence
+
+  @typedoc "A language identifier (e.g. `\"en\"`) or `nil`/`\"\"` for the main language."
+  @type language :: String.t() | nil
+
+  @doc "Output path for a published post (e.g. `2024/05/12/slug/index.html`)."
+  @spec post_output_path(map()) :: String.t()
+  def post_output_path(post), do: post_output_path(post, nil)
+
+  @spec post_output_path(map(), language()) :: String.t()
+  def post_output_path(post, language) when is_map(post) do
+    {year, month, day} = local_date_parts!(post.created_at)
+    year = Integer.to_string(year)
+    month = month |> Integer.to_string() |> String.pad_leading(2, "0")
+    day = day |> Integer.to_string() |> String.pad_leading(2, "0")
+
+    path_parts = [year, month, day, post.slug, "index.html"]
+
+    case language do
+      nil -> Path.join(path_parts)
+      "" -> Path.join(path_parts)
+      value -> Path.join([value | path_parts])
+    end
+  end
+
+  @spec paginated_archive_paths(language(), [String.t()], non_neg_integer(), pos_integer()) ::
+          [String.t()]
+  def paginated_archive_paths(route_language, segments, total_items, max_posts_per_page) do
+    total_pages = page_count(total_items, max_posts_per_page)
+
+    Enum.map(1..total_pages, fn page_number ->
+      archive_path(route_language, segments, page_number)
+    end)
+  end
+
+  @spec root_route_paths(language(), non_neg_integer(), pos_integer()) :: [String.t()]
+  def root_route_paths(route_language, total_items, max_posts_per_page) do
+    total_pages = page_count(total_items, max_posts_per_page)
+
+    Enum.map(1..total_pages, fn page_number ->
+      root_output_path(route_language, page_number)
+    end)
+  end
+
+  @spec root_output_path(language(), pos_integer()) :: String.t()
+  def root_output_path(nil, 1), do: "index.html"
+  def root_output_path("", 1), do: "index.html"
+  def root_output_path(route_language, 1), do: Path.join(route_language, "index.html")
+  def root_output_path(nil, page_number), do: Path.join(["page", Integer.to_string(page_number), "index.html"])
+  def root_output_path("", page_number), do: root_output_path(nil, page_number)
+  def root_output_path(route_language, page_number), do: Path.join([route_language, "page", Integer.to_string(page_number), "index.html"])
+
+  @spec page_output_path(String.t(), language()) :: String.t()
+  def page_output_path(slug, nil), do: Path.join([slug, "index.html"])
+  def page_output_path(slug, ""), do: page_output_path(slug, nil)
+  def page_output_path(slug, language), do: Path.join([language, slug, "index.html"])
+
+  @spec pagination_for_page(pos_integer(), pos_integer(), non_neg_integer(), pos_integer(), language(), [String.t()]) ::
+          map()
+  def pagination_for_page(page_number, total_pages, total_items, items_per_page, route_language, segments) do
+    %{
+      current_page: page_number,
+      total_pages: total_pages,
+      total_items: total_items,
+      items_per_page: items_per_page,
+      has_prev_page: page_number > 1,
+      prev_page_href: archive_or_root_href(route_language, segments, page_number - 1),
+      has_next_page: page_number < total_pages,
+      next_page_href: archive_or_root_href(route_language, segments, page_number + 1)
+    }
+  end
+
+  @spec archive_or_root_href(language(), [String.t()], integer()) :: String.t()
+  def archive_or_root_href(_route_language, _segments, page_number) when page_number < 1, do: ""
+  def archive_or_root_href(route_language, [], page_number), do: root_page_href(route_language, page_number)
+  def archive_or_root_href(route_language, segments, page_number), do: archive_href(route_language, segments, page_number)
+
+  @spec root_page_href(language(), integer()) :: String.t()
+  def root_page_href(route_language, page_number) when page_number <= 1 do
+    case route_language do
+      nil -> "/"
+      "" -> "/"
+      language -> "/#{language}/"
+    end
+  end
+
+  def root_page_href(route_language, page_number) do
+    base =
+      case route_language do
+        nil -> ""
+        "" -> ""
+        language -> "/#{language}"
+      end
+
+    "#{base}/page/#{page_number}/"
+  end
+
+  @spec archive_href(language(), [String.t()], pos_integer()) :: String.t()
+  def archive_href(language, segments, page_number) do
+    archive_path(language, segments, page_number)
+    |> String.trim_trailing("index.html")
+    |> then(&("/" <> String.trim_leading(&1, "/")))
+  end
+
+  @spec page_count(integer(), pos_integer()) :: pos_integer()
+  def page_count(total_items, _max_posts_per_page) when total_items <= 0, do: 1
+
+  def page_count(total_items, max_posts_per_page) do
+    page_size = max(max_posts_per_page, 1)
+    div(total_items + page_size - 1, page_size)
+  end
+
+  @spec paginate_posts([map()], pos_integer()) :: [[map()]]
+  def paginate_posts(posts, max_posts_per_page) do
+    case Enum.chunk_every(posts, max(max_posts_per_page, 1)) do
+      [] -> [[]]
+      chunks -> chunks
+    end
+  end
+
+  @spec root_pagination_pages(non_neg_integer(), pos_integer()) :: [pos_integer()]
+  def root_pagination_pages(total_items, max_posts_per_page) do
+    case page_count(total_items, max_posts_per_page) do
+      total_pages when total_pages > 1 -> Enum.to_list(2..total_pages)
+      _other -> []
+    end
+  end
+
+  @spec archive_path(language(), [String.t()], pos_integer()) :: String.t()
+  def archive_path(language, segments, 1), do: archive_path(language, segments)
+
+  def archive_path(language, segments, page_number) do
+    archive_path(language, segments ++ ["page", Integer.to_string(page_number)])
+  end
+
+  @spec archive_path(language(), [String.t()]) :: String.t()
+  def archive_path(nil, segments), do: Path.join(segments ++ ["index.html"])
+  def archive_path("", segments), do: Path.join(segments ++ ["index.html"])
+
+  def archive_path(language, segments) do
+    prefix = if language in [nil, ""], do: [], else: [language]
+    Path.join(prefix ++ segments ++ ["index.html"])
+  end
+
+  @spec archive_route_segment(any()) :: String.t()
+  def archive_route_segment(nil), do: ""
+  def archive_route_segment(value), do: value |> to_string() |> URI.encode(&URI.char_unreserved?/1)
+
+  @spec normalize_base_url(String.t() | nil) :: String.t() | nil
+  def normalize_base_url(nil), do: nil
+  def normalize_base_url(url), do: String.trim_trailing(url, "/")
+
+  @spec normalize_blog_languages(String.t() | nil, [String.t()] | nil) :: [String.t()]
+  def normalize_blog_languages(main_language, blog_languages) do
+    ([main_language] ++ (blog_languages || []))
+    |> Enum.reject(&(&1 in [nil, ""]))
+    |> Enum.uniq()
+  end
+
+  @spec route_language(language(), language()) :: language()
+  def route_language(main_language, language) when main_language == language, do: nil
+  def route_language(_main_language, language), do: language
+
+  @spec language_prefix(language(), language()) :: String.t()
+  def language_prefix(language, main_language) when language == main_language, do: ""
+  def language_prefix(nil, _main_language), do: ""
+  def language_prefix(language, _main_language), do: "/#{language}"
+
+  @spec url_for_path(String.t() | nil, String.t()) :: String.t()
+  def url_for_path(nil, path), do: ensure_trailing_slash(path)
+
+  def url_for_path(base_url, path) do
+    String.trim_trailing(base_url, "/") <> ensure_trailing_slash(path)
+  end
+
+  @spec url_for_output(String.t() | nil, String.t()) :: String.t()
+  def url_for_output(nil, relative_path), do: "/" <> String.trim_leading(relative_path, "/")
+
+  def url_for_output(base_url, relative_path) do
+    cleaned = relative_path |> String.trim_leading("/") |> String.trim_trailing("index.html")
+    suffix = if cleaned == "", do: "/", else: "/" <> cleaned
+    String.trim_trailing(base_url, "/") <> suffix
+  end
+
+  @spec ensure_trailing_slash(String.t()) :: String.t()
+  def ensure_trailing_slash(path) do
+    normalized_path = normalize_url_path(path)
+    if normalized_path == "/", do: "/", else: normalized_path <> "/"
+  end
+
+  @spec normalize_url_path(String.t() | nil) :: String.t()
+  def normalize_url_path(nil), do: "/"
+
+  def normalize_url_path(url_path) do
+    trimmed = String.trim(url_path || "")
+
+    cond do
+      trimmed in ["", "/"] ->
+        "/"
+
+      true ->
+        trimmed
+        |> String.split(["?", "#"])
+        |> List.first()
+        |> to_string()
+        |> String.trim("/")
+        |> case do
+          "" -> "/"
+          value -> "/" <> value
+        end
+    end
+  end
+
+  @spec relative_path_to_url_path(String.t()) :: String.t()
+  def relative_path_to_url_path(relative_path) do
+    relative_path
+    |> String.trim_leading("/")
+    |> String.trim_trailing("index.html")
+    |> String.trim_trailing("/")
+    |> case do
+      "" -> "/"
+      value -> "/" <> value
+    end
+  end
+
+  @spec url_path_to_relative_index_path(String.t()) :: String.t()
+  def url_path_to_relative_index_path("/"), do: "index.html"
+
+  def url_path_to_relative_index_path(url_path) do
+    url_path
+    |> normalize_url_path()
+    |> String.trim_leading("/")
+    |> Path.join("index.html")
+  end
+
+  @spec sitemap_route_output?(String.t()) :: boolean()
+  def sitemap_route_output?("404.html"), do: false
+  def sitemap_route_output?("feed.xml"), do: false
+  def sitemap_route_output?("atom.xml"), do: false
+  def sitemap_route_output?("calendar.json"), do: false
+  def sitemap_route_output?(relative_path), do: String.ends_with?(relative_path, ".html")
+
+  @spec truthy_flag?(term()) :: boolean()
+  def truthy_flag?(value), do: value not in [false, nil]
+
+  @doc "Returns the local-time `{year, month, day}` for a unix-ms-or-binary timestamp."
+  @spec local_date_parts!(term()) :: {integer(), integer(), integer()}
+  def local_date_parts!(value) do
+    normalized = Persistence.normalize_unix_timestamp(value)
+    {{year, month, day}, _time} = :calendar.system_time_to_local_time(normalized, :millisecond)
+    {year, month, day}
+  end
+
+  @spec local_date_iso8601!(term()) :: String.t()
+  def local_date_iso8601!(value) do
+    {year, month, day} = local_date_parts!(value)
+    Date.new!(year, month, day) |> Date.to_iso8601()
+  end
+end
diff --git a/lib/bds/generation/progress.ex b/lib/bds/generation/progress.ex
new file mode 100644
index 0000000..2af69ed
--- /dev/null
+++ b/lib/bds/generation/progress.ex
@@ -0,0 +1,96 @@
+defmodule BDS.Generation.Progress do
+  @moduledoc false
+
+  @typedoc "A 2-arity progress callback `(progress :: float(), message :: String.t()) -> any()`."
+  @type callback :: (float(), String.t() -> any()) | nil
+
+  @typedoc "A 3-arity stage callback `(stage :: atom(), current :: integer(), total :: integer()) -> any()`."
+  @type stage_callback :: (atom(), integer(), integer() -> any()) | nil
+
+  @doc "Extract the `:on_progress` callback from a keyword list of options."
+  @spec callback(keyword()) :: callback()
+  def callback(opts) do
+    case Keyword.get(opts, :on_progress) do
+      cb when is_function(cb, 2) -> cb
+      _other -> nil
+    end
+  end
+
+  @spec report_generation_started(callback(), non_neg_integer(), String.t()) :: :ok
+  def report_generation_started(nil, _total, _label), do: :ok
+
+  def report_generation_started(callback, 0, label) do
+    callback.(1.0, "No #{label} to process")
+    :ok
+  end
+
+  def report_generation_started(callback, total, label) do
+    callback.(0.0, "Processing 0/#{total} #{label}")
+    :ok
+  end
+
+  @spec report_generation_progress(callback(), non_neg_integer(), non_neg_integer(), String.t()) :: :ok
+  def report_generation_progress(nil, _current, _total, _label), do: :ok
+  def report_generation_progress(_callback, _current, 0, _label), do: :ok
+
+  def report_generation_progress(callback, current, total, label) do
+    callback.(current / total, "Processing #{current}/#{total} #{label}")
+    :ok
+  end
+
+  @spec report_validation_progress(callback(), float(), String.t()) :: :ok
+  def report_validation_progress(nil, _progress, _message), do: :ok
+
+  def report_validation_progress(callback, progress, message) do
+    callback.(progress, message)
+    :ok
+  end
+
+  @spec report_validation_snapshot_progress(callback(), atom(), non_neg_integer(), integer()) :: :ok
+  def report_validation_snapshot_progress(nil, _stage, _current, _total), do: :ok
+
+  def report_validation_snapshot_progress(_callback, _stage, _current, total)
+      when total <= 0,
+      do: :ok
+
+  def report_validation_snapshot_progress(callback, :posts, current, total) do
+    progress = min(0.18, current / total * 0.18)
+    callback.(progress, "Collecting sitemap URLs... #{current}/#{total}")
+    :ok
+  end
+
+  def report_validation_snapshot_progress(callback, :translations, current, total) do
+    progress = 0.18 + min(0.12, current / total * 0.12)
+    callback.(progress, "Collecting sitemap URLs... #{current}/#{total}")
+    :ok
+  end
+
+  @spec report_validation_collection_progress(callback(), non_neg_integer(), integer()) :: :ok
+  def report_validation_collection_progress(nil, _current, _total), do: :ok
+  def report_validation_collection_progress(_callback, _current, total) when total <= 0, do: :ok
+
+  def report_validation_collection_progress(callback, current, total) do
+    progress = min(0.49, 0.30 + current / total * 0.19)
+    callback.(progress, "Collecting sitemap URLs... #{current}/#{total}")
+    :ok
+  end
+
+  @spec report_snapshot_stage_progress(stage_callback(), atom(), non_neg_integer(), integer()) :: :ok
+  def report_snapshot_stage_progress(nil, _stage, _current, _total), do: :ok
+  def report_snapshot_stage_progress(_callback, _stage, _current, total) when total <= 0, do: :ok
+
+  def report_snapshot_stage_progress(callback, stage, current, total) do
+    callback.(stage, current, total)
+    :ok
+  end
+
+  @spec report_validation_compare_progress(callback(), non_neg_integer(), integer()) :: :ok
+  def report_validation_compare_progress(nil, _current, _total), do: :ok
+  def report_validation_compare_progress(_callback, _current, total) when total <= 0, do: :ok
+
+  def report_validation_compare_progress(callback, current, total) do
+    progress = min(0.99, 0.5 + current / total * 0.49)
+    callback.(progress, "Comparing sitemap to html pages... #{current}/#{total}")
+    :ok
+  end
+end
diff --git a/lib/bds/generation/renderers.ex b/lib/bds/generation/renderers.ex
new file mode 100644
index 0000000..546ecfb
--- /dev/null
+++ b/lib/bds/generation/renderers.ex
@@ -0,0 +1,227 @@
+defmodule BDS.Generation.Renderers do
+  @moduledoc false
+
+  alias BDS.Generation.Paths
+  alias BDS.Projects
+  alias BDS.Rendering
+
+  @doc "Render the home page (HTML) using the project's template engine."
+  @spec render_home(map(), String.t() | nil) :: String.t()
+  def render_home(plan, language) do
+    [
+      "",
+      "",
+      plan.project_name,
+      "",
+      "

    ", + plan.project_name, + "

    ", + "" + ] + |> IO.iodata_to_binary() + end + + @doc "Render a single post page using the post template (fallback to a tiny inline shell)." + @spec render_post_page(String.t(), iodata(), String.t(), String.t() | nil) :: String.t() + def render_post_page(title, body, slug, language) do + [ + "", + "", + to_string(title), + "", + "
    ", + body, + "
    ", + "" + ] + |> IO.iodata_to_binary() + end + + @doc "Render an archive page (category, tag, year) with pagination." + @spec render_archive_page(map(), String.t(), [map()], String.t() | nil, String.t(), map()) :: String.t() + def render_archive_page(plan, title, posts, language, kind, pagination) do + fallback = fn -> + items = + posts + |> Enum.map(fn post -> ["
  • ", post.title, "
  • "] end) + |> IO.iodata_to_binary() + + [ + "

    ", + title, + "

      ", + items, + "
    " + ] + |> IO.iodata_to_binary() + end + + render_list_output( + plan, + language, + title, + Enum.map(posts, fn post -> + %{ + id: post.id, + slug: post.slug, + title: post.title, + href: "#", + excerpt: post.excerpt, + content: nil, + language: post.language + } + end), + %{kind: kind, name: title}, + pagination, + fallback + ) + end + + @doc "Render a date-archive page (year/month/day) with pagination." + @spec render_date_archive_page(map(), String.t(), map(), [map()], String.t() | nil, map()) :: + String.t() + def render_date_archive_page(plan, label, archive_context, posts, language, pagination) do + fallback = fn -> + items = + posts + |> Enum.map(fn post -> ["
  • ", post.title, "
  • "] end) + |> IO.iodata_to_binary() + + [ + "

    ", + label, + "

      ", + items, + "
    " + ] + |> IO.iodata_to_binary() + end + + render_list_output( + plan, + language, + label, + build_list_posts(plan.base_url, posts, Paths.route_language(plan.language, language)), + archive_context, + pagination, + fallback + ) + end + + @doc "Try the project's post template; on error, fall back to the inline `fallback` thunk." + @spec render_post_output(String.t(), String.t() | nil, map(), (-> String.t())) :: String.t() + def render_post_output(project_id, template_slug, assigns, fallback) do + case Rendering.render_post_page(project_id, template_slug, assigns) do + {:ok, rendered} -> rendered + {:error, _reason} -> fallback.() + end + end + + @doc "Render a list/archive page through the project template, falling back to inline." + @spec render_list_output(map(), String.t() | nil, String.t(), [map()], map(), map(), (-> String.t())) :: + String.t() + def render_list_output( + %{project_id: project_id, language: main_language}, + language, + page_title, + posts, + archive_context, + pagination, + fallback + ) + when is_binary(project_id) do + case Rendering.render_list_page(project_id, %{ + language: language, + language_prefix: Paths.language_prefix(language, main_language), + page_title: page_title, + posts: posts, + archive_context: archive_context, + pagination: pagination + }) do + {:ok, rendered} -> rendered + {:error, _reason} -> fallback.() + end + end + + @doc "Render the project's 404 page via its template, falling back to a static page." + @spec render_not_found_output(map(), String.t() | nil) :: String.t() + def render_not_found_output(%{project_id: project_id, language: main_language}, language) + when is_binary(project_id) do + case Rendering.render_not_found_page(project_id, %{ + language: language, + language_prefix: Paths.language_prefix(language, main_language) + }) do + {:ok, rendered} -> rendered + {:error, _reason} -> render_not_found_page(language) + end + end + + @doc "Static fallback HTML for a 404 page." 
+ @spec render_not_found_page(String.t() | nil) :: String.t() + def render_not_found_page(language) do + [ + "

    404

    Not Found

    "
+    ]
+    |> IO.iodata_to_binary()
+  end
+
+  @doc "Build the list-of-posts payload (with hrefs and bodies) for archive/list templates."
+  @spec build_list_posts(String.t() | nil, [map()], String.t() | nil) :: [map()]
+  def build_list_posts(base_url, posts, language_prefix) do
+    Enum.map(posts, fn post ->
+      %{
+        id: post.id,
+        slug: post.slug,
+        title: post.title,
+        href: Paths.url_for_output(base_url, Paths.post_output_path(post, language_prefix)),
+        excerpt: post.excerpt,
+        content: load_body(post.project_id, post.file_path, post.content)
+      }
+    end)
+  end
+
+  @doc "Load the post body from disk (or pass-through inline content) for list rendering."
+  @spec load_body(String.t() | nil, String.t() | nil, String.t() | nil) :: String.t()
+  def load_body(_project_id, _file_path, inline_content) when is_binary(inline_content),
+    do: inline_content
+
+  def load_body(project_id, file_path, _inline_content) do
+    case file_path do
+      nil ->
+        ""
+
+      "" ->
+        ""
+
+      value ->
+        project_path =
+          Path.expand(value, Projects.project_data_dir(Projects.get_project!(project_id)))
+
+        case File.read(project_path) do
+          {:ok, contents} -> parse_frontmatter_body(contents)
+          {:error, _reason} -> ""
+        end
+    end
+  end
+
+  defp parse_frontmatter_body(contents) do
+    case String.split(contents, "\n---\n", parts: 2) do
+      [_frontmatter, body] -> String.trim_trailing(body, "\n")
+      _parts -> contents
+    end
+  end
+end
diff --git a/lib/bds/generation/sitemap.ex b/lib/bds/generation/sitemap.ex
new file mode 100644
index 0000000..506344e
--- /dev/null
+++ b/lib/bds/generation/sitemap.ex
@@ -0,0 +1,280 @@
+defmodule BDS.Generation.Sitemap do
+  @moduledoc false
+
+  alias BDS.Generation.Paths
+  alias BDS.Persistence
+
+  @doc "Render a simple sitemap with a flat list of URLs."
+  @spec render([String.t()]) :: String.t()
+  def render(urls) do
+    entries = Enum.map_join(urls, "", fn url -> "#{xml_escape(url)}" end)
+    "#{entries}"
+  end
+
+  @doc "Render the multilingual sitemap with hreflang alternates for the project."
+  @spec render_multi_language(map(), [map()], [map()], [map()], map(), [String.t()]) :: String.t()
+  def render_multi_language(
+        plan,
+        translatable_posts,
+        do_not_translate_posts,
+        published_list_posts,
+        post_index,
+        additional_languages
+      ) do
+    all_languages = [plan.language | additional_languages]
+    latest_post_updated_at = latest_post_updated_at_iso(published_list_posts)
+
+    urls =
+      [
+        url_entry(
+          Paths.url_for_path(plan.base_url, "/"),
+          latest_post_updated_at,
+          "daily",
+          "1.0",
+          build_hreflang_links(plan.base_url, "/", plan.language, all_languages)
+        )
+      ] ++
+        Enum.map(Paths.root_pagination_pages(length(published_list_posts), plan.max_posts_per_page), fn page_number ->
+          page_path = "/page/#{page_number}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, page_path),
+            latest_post_updated_at,
+            "daily",
+            "0.9",
+            build_hreflang_links(plan.base_url, page_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(translatable_posts, fn post ->
+          post_path = Paths.relative_path_to_url_path(Paths.post_output_path(post))
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, post_path),
+            unix_ms_to_iso8601(post.updated_at),
+            "monthly",
+            "0.8",
+            build_hreflang_links(plan.base_url, post_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(do_not_translate_posts, fn post ->
+          post_path = Paths.relative_path_to_url_path(Paths.post_output_path(post))
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, post_path),
+            unix_ms_to_iso8601(post.updated_at),
+            "monthly",
+            "0.8",
+            build_hreflang_links(plan.base_url, post_path, plan.language, [plan.language])
+          )
+        end) ++
+        Enum.flat_map(translatable_posts ++ do_not_translate_posts, fn post ->
+          if "page" in (post.categories || []) and to_string(post.slug) != "" do
+            page_path = Paths.relative_path_to_url_path(Paths.page_output_path(post.slug, nil))
+
+            languages =
+              if Paths.truthy_flag?(Map.get(post, :do_not_translate)),
+                do: [plan.language],
+                else: all_languages
+
+            [
+              url_entry(
+                Paths.url_for_path(plan.base_url, page_path),
+                unix_ms_to_iso8601(post.updated_at),
+                "weekly",
+                "0.7",
+                build_hreflang_links(plan.base_url, page_path, plan.language, languages)
+              )
+            ]
+          else
+            []
+          end
+        end) ++
+        Enum.map(Enum.sort_by(post_index.posts_by_year, &elem(&1, 0), :desc), fn {year, _posts} ->
+          year_path = "/#{year}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, year_path),
+            latest_post_updated_at,
+            "monthly",
+            "0.5",
+            build_hreflang_links(plan.base_url, year_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(Enum.sort_by(post_index.posts_by_year_month, &elem(&1, 0), :desc), fn {year_month, _posts} ->
+          month_path = "/#{year_month}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, month_path),
+            latest_post_updated_at,
+            "monthly",
+            "0.5",
+            build_hreflang_links(plan.base_url, month_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(Enum.sort_by(post_index.posts_by_year_month_day, &elem(&1, 0), :desc), fn {year_month_day, _posts} ->
+          day_path = "/#{year_month_day}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, day_path),
+            latest_post_updated_at,
+            "monthly",
+            "0.4",
+            build_hreflang_links(plan.base_url, day_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(Enum.sort_by(post_index.posts_by_category, &elem(&1, 0)), fn {category, _posts} ->
+          category_path = "/category/#{Paths.archive_route_segment(category)}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, category_path),
+            latest_post_updated_at,
+            "weekly",
+            "0.6",
+            build_hreflang_links(plan.base_url, category_path, plan.language, all_languages)
+          )
+        end) ++
+        Enum.map(Enum.sort_by(post_index.posts_by_tag, &elem(&1, 0)), fn {tag, _posts} ->
+          tag_path = "/tag/#{Paths.archive_route_segment(tag)}"
+
+          url_entry(
+            Paths.url_for_path(plan.base_url, tag_path),
+            latest_post_updated_at,
+            "weekly",
+            "0.6",
+            build_hreflang_links(plan.base_url, tag_path, plan.language, all_languages)
+          )
+        end)
+
+    [
+      "",
+      "",
+      Enum.join(urls, "\n"),
+      "",
+      ""
+    ]
+    |> Enum.join("\n")
+  end
+
+  @doc "Render an RSS feed for the given language."
+  @spec render_feed(map(), String.t() | nil, [map()]) :: String.t()
+  def render_feed(plan, language, published_posts) do
+    items =
+      published_posts
+      |> Enum.filter(&(&1.language == language or language == plan.language))
+      |> Enum.map(fn post ->
+        "#{xml_escape(post.title)}#{Paths.url_for_output(plan.base_url, Paths.post_output_path(post))}"
+      end)
+      |> Enum.join()
+
+    "#{xml_escape(plan.project_name)} (#{xml_escape(language || "default")})#{items}"
+  end
+
+  @doc "Render an Atom feed for the given language."
+  @spec render_atom(map(), String.t() | nil, [map()]) :: String.t()
+  def render_atom(plan, language, published_posts) do
+    entries =
+      published_posts
+      |> Enum.filter(&(&1.language == language or language == plan.language))
+      |> Enum.map(fn post ->
+        "#{xml_escape(post.title)}#{Paths.url_for_output(plan.base_url, Paths.post_output_path(post))}"
+      end)
+      |> Enum.join()
+
+    "#{xml_escape(plan.project_name)} (#{xml_escape(language || "default")})#{entries}"
+  end
+
+  @doc "Render a JSON calendar of all published posts."
+  @spec render_calendar([map()]) :: String.t()
+  def render_calendar(published_posts) do
+    published_posts
+    |> Enum.map(fn post ->
+      %{date: Paths.local_date_iso8601!(post.created_at), slug: post.slug, title: post.title}
+    end)
+    |> Jason.encode!()
+  end
+
+  @doc "Extract the `<loc>` values from a sitemap XML document."
+  @spec extract_locs(String.t()) :: [String.t()]
+  def extract_locs(sitemap_xml) do
+    Regex.scan(~r/<loc>(.*?)<\/loc>/, sitemap_xml, capture: :all_but_first)
+    |> Enum.map(fn [value] -> String.trim(value) end)
+    |> Enum.reject(&(&1 == ""))
+  end
+
+  @doc "Translate a sitemap `<loc>` URL to a normalized project-relative URL path."
+  @spec loc_to_project_path(String.t(), String.t() | nil) :: String.t()
+  def loc_to_project_path(loc, nil), do: Paths.normalize_url_path(loc)
+
+  def loc_to_project_path(loc, base_url) do
+    with {:ok, loc_uri} <- URI.new(loc),
+         {:ok, base_uri} <- URI.new(base_url) do
+      loc_path = String.trim_trailing(loc_uri.path || "/", "/")
+      base_path = String.trim_trailing(base_uri.path || "", "/")
+
+      cond do
+        base_path != "" and String.starts_with?(loc_path, base_path) ->
+          loc_path
+          |> String.replace_prefix(base_path, "")
+          |> Paths.normalize_url_path()
+
+        true ->
+          Paths.normalize_url_path(loc_path)
+      end
+    else
+      _other -> Paths.normalize_url_path(loc)
+    end
+  end
+
+  @doc "Escape a string for inclusion in XML."
+  @spec xml_escape(term()) :: String.t()
+  def xml_escape(value) do
+    value
+    |> to_string()
+    |> String.replace("&", "&amp;")
+    |> String.replace("<", "&lt;")
+    |> String.replace(">", "&gt;")
+    |> String.replace("\"", "&quot;")
+    |> String.replace("'", "&apos;")
+  end
+
+  @doc "ISO-8601 string of the most recently updated post (or now)."
+  @spec latest_post_updated_at_iso([map()]) :: String.t()
+  def latest_post_updated_at_iso([]), do: DateTime.utc_now() |> DateTime.to_iso8601()
+  def latest_post_updated_at_iso([post | _rest]), do: unix_ms_to_iso8601(post.updated_at)
+
+  @doc "Convert a unix-ms (or nil) timestamp to ISO-8601."
+  @spec unix_ms_to_iso8601(integer() | nil) :: String.t()
+  def unix_ms_to_iso8601(nil), do: DateTime.utc_now() |> DateTime.to_iso8601()
+  def unix_ms_to_iso8601(value), do: value |> Persistence.from_unix_ms!() |> DateTime.to_iso8601()
+
+  defp build_hreflang_links(base_url, url_path, main_language, languages) do
+    Enum.map(languages, fn language ->
+      prefixed_path =
+        if language == main_language do
+          url_path
+        else
+          Paths.normalize_url_path("/#{language}#{url_path}")
+        end
+
+      canonical_href = Paths.url_for_path(base_url, prefixed_path)
+
+      " "
+    end) ++
+      [
+        " "
+      ]
+  end
+
+  defp url_entry(loc, lastmod, changefreq, priority, hreflang_links) do
+    [
+      " ",
+      " #{xml_escape(loc)}",
+      " #{xml_escape(lastmod)}",
+      " #{changefreq}",
+      " #{priority}",
+      Enum.join(hreflang_links, "\n"),
+      " "
+    ]
+    |> Enum.join("\n")
+  end
+end
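
Review note: a minimal sketch of how the escaping and `<loc>` extraction helpers that moved into `BDS.Generation.Sitemap` might interact. `SitemapSketch` is a hypothetical stand-alone module; the `<url><loc>…</loc></url>` wrapper and the entity strings are assumptions, since the markup in this diff did not survive extraction, and none of it is taken from the project's real output.

```elixir
defmodule SitemapSketch do
  # Hypothetical module for illustration only; it mirrors the shape of the
  # helpers in the diff, not the project's actual implementation.

  # "&" is replaced first. If it came last, the "&" introduced by the other
  # replacements (e.g. in "&lt;") would be escaped a second time.
  def xml_escape(value) do
    value
    |> to_string()
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
    |> String.replace("'", "&apos;")
  end

  # Assumed element layout: one <url><loc>...</loc></url> per entry.
  def render(urls) do
    Enum.map_join(urls, "", fn url -> "<url><loc>#{xml_escape(url)}</loc></url>" end)
  end

  # Non-greedy capture of the text between <loc> and </loc>.
  def extract_locs(xml) do
    Regex.scan(~r/<loc>(.*?)<\/loc>/, xml, capture: :all_but_first)
    |> Enum.map(fn [value] -> String.trim(value) end)
    |> Enum.reject(&(&1 == ""))
  end
end

xml = SitemapSketch.render(["https://example.com/a?x=1&y=2", "https://example.com/b"])
IO.inspect(SitemapSketch.extract_locs(xml))
# prints ["https://example.com/a?x=1&amp;y=2", "https://example.com/b"]
```

In this sketch the extracted value is still entity-escaped, so a caller comparing it against raw URLs would need to unescape it first or, as a path-normalizing step that drops the query string would do, compare only the path component.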