Frederik Elwert on Nostr: In a methods / #DigitalHumamities class next semester, I want to cover basic corpus ...
In a methods / #DigitalHumamities class next semester, I want to cover basic corpus creation. In particular, I’ll probably focus on #OCR/#HTR/#ATR and #WebScraping. I find it incredibly hard to find good papers that can serve as a general introduction to these topics. All I find are either practical tutorials or very specialized papers about specific approaches. Do you have any favorite readings about how to get to a text corpus in DH in the first place? Please share!
Published at
2024-09-25 10:37:19
Event JSON
{
"id": "311b1ced8b758677e58c67e52c2860f6aa24c79a8f8c76e8653f78adefc659bd",
"pubkey": "955475a70a4c00a1ce335e0d1911da6e66fd9f8a6f9cd12639c253c908a9bc75",
"created_at": 1727260639,
"kind": 1,
"tags": [
[
"t",
"ocr"
],
[
"t",
"webscraping"
],
[
"proxy",
"https://fedihum.org/@felwert/113197753280137560",
"web"
],
[
"t",
"digitalhumamities"
],
[
"proxy",
"https://fedihum.org/users/felwert/statuses/113197753280137560",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://fedihum.org/users/felwert/statuses/113197753280137560",
"pink.momostr"
],
[
"-"
]
],
"content": "In a methods / #DigitalHumamities class next semester, I want to cover basic corpus creation. Especially, I’ll probably focus on #OCR/#HTR/#ATR and #WebScraping. I find it incredibly hard to find good papers that can serve as a general introduction into these topics. All I find are either practical tutorials, or very specialized papers about specific approaches. Do you have any favorite readings about how to get to a text corpus in DH in the first place? Please share!",
"sig": "a157c4c3e937567e80610d10724f8de2bbd0db7e6a0daa9d08b53d066b47795486c05965d9c07df89aa45a4540e7bdecb9fb6d4a08fddfbc01bfac651ff3a6c7"
}