    Imaginator: A Text-To-Image Model Pipeline

    Files
    Masters_Project_Brandon_Horton ... (PDF, 16.41Mb)
    Brandon_Horton_Library_Release ... (PDF, 162.2Kb)
    Author
    Horton, Brandon H.
    Keyword
    Gradio
    artificial intelligence (AI)
    text-to-image generation model
    deep learning techniques
    multi-document summarization
    Optical Character Recognition (OCR)
    Large Language Model (LLM)
    Stable diffusion
    Readers/Advisors
    Reale, Michael, Ph.D.
    Confer, Amos, Dr.
    Chiang, Chen-fu, Dr.
    Term and Year
    Spring 2024
    Date Published
    2023-12-22
    
    URI
    http://hdl.handle.net/20.500.12648/15609
    Abstract
    This work presents a pipeline of three separate parts that creates an image from a passage of text, whether that passage comes from a book or some other form of media. It uses Gradio, a web-app-based hosting program, to combine these parts into one pipeline.[1] It also includes a way to generate a dataset of optimal Stable Diffusion prompts, using chatgptv3.5-turbo-1106, for the purposes of fine-tuning or training.[2] Based on research, this may be a first-of-its-kind dataset for the field. First, the pipeline uses PyTesseract (TesseractOCR) and opencv2 to clean up an image of a book page, or other written text, and obtain plain text from it. Then, the pipeline sends this plain text to a fine-tuned LLM based on long-t5-tglobal-xl-16384-book-summary, which is itself based on the LongT5 document summarization model type, fine-tuned to produce output that is friendly to Stable Diffusion.[12] This output can be characterized as a series of tags or short descriptors separated by commas. Once this output is produced, it is sent to the final step in the pipeline, a Stable Diffusion model, specifically Stable Diffusion XL Turbo, which produces an image based on the summarized text.[15] In user testing, the generated image was fairly accurate to the original book passage. Due to limitations, and because this is a first-of-its-kind project, there is no existing output to compare it against.
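
    The three stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the names `TextToImagePipeline`, `to_tag_prompt`, `tesseract_ocr`, and `sdxl_turbo_generator` are hypothetical, the Otsu-threshold cleanup and the tag deduplication are assumed choices, and the summarizer is left as an injected callable because the fine-tuned checkpoint is not named in this record. The `stabilityai/sdxl-turbo` identifier is the public SDXL Turbo checkpoint, which may or may not be the exact one used.

    ```python
    from dataclasses import dataclass
    from typing import Any, Callable


    def to_tag_prompt(summary: str) -> str:
        """Normalize summarizer output into comma-separated tags.

        Dropping empty and repeated tags is an assumption made for this sketch.
        """
        seen, tags = set(), []
        for raw in summary.split(","):
            tag = raw.strip()
            if tag and tag.lower() not in seen:
                seen.add(tag.lower())
                tags.append(tag)
        return ", ".join(tags)


    @dataclass
    class TextToImagePipeline:
        """Chains the three stages: OCR, then summarization, then image generation."""
        ocr: Callable[[Any], str]          # page image -> plain text
        summarize: Callable[[str], str]    # plain text -> comma-separated descriptors
        generate: Callable[[str], Any]     # prompt -> image

        def __call__(self, page_image: Any):
            text = self.ocr(page_image)
            prompt = to_tag_prompt(self.summarize(text))
            return prompt, self.generate(prompt)


    def tesseract_ocr(page_image) -> str:
        """OCR stage: binarize an RGB page image, then run Tesseract on it."""
        import cv2          # imported lazily so the skeleton runs without them
        import pytesseract
        gray = cv2.cvtColor(page_image, cv2.COLOR_RGB2GRAY)
        # Otsu thresholding is one common cleanup step (assumed, not confirmed).
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        return pytesseract.image_to_string(binary)


    def sdxl_turbo_generator() -> Callable[[str], Any]:
        """Image stage: load SDXL Turbo once and return a prompt -> image function."""
        from diffusers import AutoPipelineForText2Image
        pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo")

        def generate(prompt: str):
            # SDXL Turbo is designed for single-step sampling without guidance.
            return pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]

        return generate
    ```

    A pipeline built this way could be exposed through Gradio by passing the instance as the `fn` argument of `gradio.Interface`, with an image input and text and image outputs. Keeping each stage as a plain callable lets any one model be swapped or stubbed without touching the other two.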
    Collections
    SUNY Polytechnic Institute College of Engineering
